Abstract

It is well known that rather general mutation-recombination mod- 
els can be solved algorithmically (though not in closed form) by means 
of Haldane linearization. The price to be paid is that one has to work 
with a multiple tensor product of the state space one started from. 

Here, we present a relevant subclass of such models, in continuous 
time, with independent mutation events at the sites, and crossover 
events between them. It admits a closed solution of the corresponding 
differential equation on the basis of the original state space, and also 
closed expressions for the linkage disequilibria, derived by means of 
Möbius inversion. As an extra benefit, the approach can be extended
to a model with selection of additive type across sites. We also derive a 
necessary and sufficient criterion for the mean fitness to be a Lyapunov 
function and determine the asymptotic behaviour of the solutions. 



Key Words: population genetics, recombination, nonlinear ODEs, 
measure-valued dynamical systems, Möbius inversion

MSC 2000: 92D10, 34L30 (primary); 37N30, 06A07, 60J25 (secondary)



Introduction 



The basic mechanisms which create genetic variation in biological evolution 
are mutation and recombination. They are counteracted by selection, which 
removes variation. Genetic information may be quite generally described in 
terms of a collection of linearly ordered sites (i.e. a sequence of sites), each 
of which is occupied by an element of a given (finite or infinite) set which 
we denote as site space; if this set is finite, it is often termed alphabet. A 
specific sequence is also called type. 

Mutation is treated as a random state change of a site variable, which 
occurs independently at every site. Recombination occurs on the occasion 
of sexual reproduction, and refers to the creation of 'offspring' sequences 
from two (randomly chosen) 'parental' ones, where a subset of the 'maternal' 
sites is combined with the complementary set of the 'paternal' sites, and 
the linear ordering along the sequence is maintained. This process is realized 
through one, or a number of, crossover events, where the two parental strands 
are interlaced between a pair of neighbouring sites. An important feature 
of recombination is that it removes dependencies between sites, known as 
linkage disequilibria in genetics. Finally, selection is caused by the flourishing 
of fit individuals at the expense of less fit ones. 

We consider an infinite population of sequences which evolves under the 
joint action of mutation, selection or recombination, or of any combination 
thereof. This is to be considered as the infinite population limit (IPL) of the 
stochastic process alluded to, and defines a deterministic dynamical system 
for probability measures (in discrete or in continuous time). It describes 
the time evolution of the measure with probabilistic certainty, see [21, Ch.
11], and Thm. 2.1 of it in particular. Although there are many interesting
and important questions connected with finite populations, we focus on the
differential equation of the deterministic limit here, which we will call the
IPL equation from now on. In particular, we will not employ the traditional
discrete dynamical systems, but follow the continuous route along the lines
of Kimura [?] and Akin [?], which happens to be much less developed than
it ought to be; see also [?] for a recent review.

Mutation is a linear process and straightforward to deal with. Selection
involves some nonlinearity, which is due to norm conservation under the
dynamics, but this nonlinearity may be removed through a simple
transformation. Recombination contains a very different source of
nonlinearity, which is due to the fact that pairs of objects are involved
in the process, and is much harder to treat. Nevertheless, if both state
space and time are discrete, a procedure (known as Haldane linearization,
see [?], [?] and [36, Ch. 6]) is available which transforms the dynamical
system (exactly) into a linear one. It involves a multilinear
transformation of the probabilities to a new set of variables, namely
certain linkage disequilibria, which describe the deviation from
statistical independence of sites. These variables decay independently and
geometrically, i.e. they decouple and diagonalize the dynamics.
Unfortunately, however, the procedure is cumbersome since it relies on
recursions, and no closed form is available for the transformation in the
general case.

In a previous paper [?], the special case of single crossover events was
considered, where offspring sequences are composed from one maternal and 
one paternal segment. This scenario is particularly relevant in molecular evo- 
lution, where crossover events are rare, and it is most consistently described 
in continuous time. For discrete site spaces, and with the help of the corre- 
sponding vector space structure, the linearizing transform could be given in 
closed form with the help of elementary methods from multilinear algebra. 

The aim of this article is to further develop this approach in a systematic 
measure-theoretic setting which also incorporates more general site spaces 
and does not require an explicit coordinatization. We will essentially start 
from the deterministic IPL equation and construct its solution explicitly, first 
for recombination only. The so-called Möbius inversion principle will then
give a simple approach to the calculation of a suitable (and, in particular, 
complete) set of linkage disequilibria. It will then turn out that mutation 
and even selection may be included in the framework, provided fitness is 
additive, meaning that the fitness of any type may be decomposed into a 
sum of independent contributions of its individual sites, i.e. if there is no 
interaction between sites. Such results may be helpful for the solution of the 
corresponding inverse problem, i.e. the determination of recombination rates 
from experimental data, e.g. observed patterns of linkage disequilibria along 
sequences [15].

The exposition will be more explicit than needed for a purely mathemat- 
ical audience, and we also try to give rather precise references to background 
material we use. We hope that the article will become more self-contained 
this way and that it is also accessible for readers with a more biological 
background.

The structure of the paper is as follows. After some preliminaries in
Section 1, we will briefly summarize the description of mutation through an
IPL equation on the space of positive measures in Section 2, followed by
some general remarks on measure-valued IPL equations. The core of the
article is Section 3, where we solve, step by step, the IPL equation for
recombination and construct an explicit solution of the abstract Cauchy
problem, together with a closed form of the corresponding linkage
disequilibria. The latter is based on an application of the
inclusion-exclusion principle via Möbius inversion (a supplement is given
in the Appendix). Section 4 combines mutation and recombination. Section 5
deals with selection and recombination, with some emphasis on the role of
mean fitness as a Lyapunov function. Finally, Section 6 ties together all
three evolutionary forces, still giving an explicit solution, expressions
for the linkage disequilibria, and asymptotic properties. We close with
some afterthoughts mainly aimed at the relationship to models in discrete
time.



1 Preliminaries 

If X is a locally compact space (by which we always mean to include the
Hausdorff property), we use $\mathcal{M}_+(X)$ to denote the set of finite
positive regular Borel measures on X, with $0 \in \mathcal{M}_+(X)$.
Likewise, $\mathcal{M}(X)$ is the vector space of real (or signed) finite
regular Borel measures. It is a Banach space under the norm
$\|\omega\| = |\omega|(X)$, where $|\omega|$ denotes the total variation
measure. Due to the Riesz-Markov representation theorem, $\mathcal{M}(X)$
can also be viewed as the dual of $C_\infty(X, \mathbb{R})$, the Banach
space of real-valued continuous functions which vanish at infinity,
equipped with the usual supremum norm, see [?, Thm. IV.18], as well as
[?, Ch. 6] and [9, Ch. IV.4] for general background material. Note that
$\mathcal{M}(X)$ with the variation norm $\|.\|$ is actually a Banach
lattice, and this gives access to the highly developed theory of positive
operators [?]. We will mainly be interested in the closed convex subsets
$\mathcal{M}_+^m(X) := \{\omega \in \mathcal{M}_+(X) \mid \omega(X) = m\}$,
and in $\mathcal{P}(X) = \mathcal{M}_+^1(X)$ in particular, the set of
probability measures on X. Note that, for positive measures $\omega$, we
simply have $\|\omega\| = \omega(X)$.

If the Borel σ-algebra of X is generated by a family of sets that is closed
under finite intersections, a regular Borel measure on X is already uniquely
specified by its values on the elements of this generating family [35]. This
is a property that we will need several times, in particular if
$X = X_1 \times X_2$ is a product space, equipped with the product topology.



Fact 1 Let $\nu, \nu'$ be two regular Borel measures on the locally compact
product space $X = X_1 \times X_2$ which coincide on all "rectangles"
$E_1 \times E_2$, where $E_1$ and $E_2$ each run through the Borel sets of
$X_1$ and $X_2$. Then $\nu = \nu'$, i.e. $\nu(E) = \nu'(E)$ for all Borel
sets E of X.

Proof: In view of the above remark, the only obstacle to cope with is
the (non-vacuous!) situation when the σ-algebra generated by the rectangles
$E_1 \times E_2$ is not the full Borel σ-algebra of X. However, the
σ-algebra generated by the rectangles contains all Baire sets of X, because
the Baire sets of X possess the required Cartesian product property
[?, Lemma 56.2], and the Borel sets of $X_i$ contain the Baire sets of
$X_i$. The equality of $\nu$ and $\nu'$ now follows from [?, Thm. 62.1]
(this rests on the fact that every Baire measure has a unique extension to
a regular Borel measure). □

Standard examples of locally compact spaces include the compact ones,
such as any finite set or the closed interval [0, 1], but also
$\mathbb{R}^k$ and $\mathbb{Z}^\ell$ with $k, \ell \ge 0$, and arbitrary
combinations thereof. These certainly cover all meaningful parameter spaces
to be expected in biological applications.

If X is a finite set (which is an important case in population genetics),
$\mathcal{P}(X)$ is a simplex. If the cardinality of X is M, this simplex
has dimension $M - 1$, i.e. any probability measure is a unique convex
linear combination of the M extremal measures that constitute the vertices
of the simplex. If $X = \{1, \ldots, M\}$, they are denoted by $e_i$,
$i = 1, \ldots, M$, and fixed by their values on singleton sets,
$e_i(\{j\}) = \delta_{ij}$. In other words, any
$\omega \in \mathcal{P}(X)$ is of the form
$\omega = \sum_{i=1}^{M} p_i e_i$ with all $p_i \ge 0$ and
$p_1 + \ldots + p_M = 1$. This provides the canonical coordinatization of
this situation.

The set (or state space) X that we need will have a product structure,
described on the basis of sites. For later convenience, we use
$N = \{0, 1, \ldots, n\}$ for the set of sites, i.e. we start counting with
0 here. To site i, we attach the locally compact space $X_i$, and our state
space is then

    $X = X_0 \times X_1 \times \cdots \times X_n$,   (1)

which is still locally compact. One Banach space of measures to show up
is the space $\mathcal{M}(X)$ with the corresponding variation norm
$\|.\|$. Note that $\mathcal{M}(X)$ contains the (algebraic) tensor product
space $\mathcal{M}^{\otimes} := \bigotimes_{i=0}^{n} \mathcal{M}(X_i)$, and
also its completion (here, the closure in the given $\|.\|$-norm of
$\mathcal{M}(X)$). To simplify notation, the latter will also be denoted by
$\mathcal{M}^{\otimes}$, because we shall only deal with Banach spaces here.
Recall that $\mathcal{M}^{\otimes}$ contains the product measures
$\omega = \omega_0 \otimes \cdots \otimes \omega_n$ with
$\omega_i \in \mathcal{M}(X_i)$, but also all (finite) linear combinations
of measures of this kind. Because we consider the completion, it also
contains all measures which can be approximated by such linear combinations
in the norm. All probability measures of product form are in this space,
but note that the single measures in the product need not be probability
measures themselves.

If $X_i = \{1, \ldots, M_i\}$ is finite for all $0 \le i \le n$, X is still
a finite set, with $M = \prod_{i=0}^{n} M_i$ elements. Then,
$\mathcal{M}(X) = \mathcal{M}^{\otimes}$, and this is simply a real vector
space of dimension M. $\mathcal{M}(X) = \mathcal{M}^{\otimes}$ is also true
for X discrete. In this case, the action of operators in tensor product
form is well defined. In general, if
$\mathcal{M}^{\otimes} \subset \mathcal{M}(X)$, one can still go beyond
$\mathcal{M}^{\otimes}$ under certain circumstances, e.g. by including
integrals (rather than finite sums) of product measures. However, we do not
want to enter this rather technical discussion, and refer to [?, Ch. IX.6]
and [?, Ch. IV.7] for some background material, and to [?, Ch. 13] for some
of the problems that are related to these difficulties.

The case of finite X is the one most frequently studied in the theory of
sequence evolution, and it was the motivation for this work, see [?] and
references given there. However, many results hold in greater generality,
which we want to cover in view of potential applications in quantitative
genetics. There, the space $X_i$ often is a state space such as
$\mathbb{R}$, or a compact subset thereof. In this case,
$\mathcal{M}^{\otimes}$ is a proper subspace of $\mathcal{M}(X)$, which has
to be taken care of later on (occasional restrictions of X to a finite set
will be mentioned explicitly).

The main reason for using the above set of sites is that we will need
ordered partitions of N, which are uniquely specified by a set of cuts or
crossovers. The possible cut positions are at the links between sites, which
we denote by half-integers, i.e. by elements of the set
$L = \{\tfrac{1}{2}, \tfrac{3}{2}, \ldots, \tfrac{2n-1}{2}\}$. We will use
Latin indices for sites and Greek indices for links, and the implicit rule
will always be that $\alpha = \tfrac{2i+1}{2}$ is the link between site i
and i + 1.

With this notation, the ordered partitions of N are in one-to-one
correspondence with the subsets of L as follows. If
$A = \{\alpha_1, \ldots, \alpha_p\} \subset L$, let $N_A$ denote the ordered
partition

    $\{0, \ldots, \lfloor\alpha_1\rfloor\}, \{\lceil\alpha_1\rceil, \ldots, \lfloor\alpha_2\rfloor\}, \ldots, \{\lceil\alpha_p\rceil, \ldots, n\}$,

where $\lfloor\alpha\rfloor$ ($\lceil\alpha\rceil$) denotes the largest
integer below $\alpha$ (the smallest above $\alpha$). In particular, we have
$N_\varnothing = N$ and $N_L = \{\{0\}, \ldots, \{n\}\}$. With this
definition, it is clear that $N_B$ is a refinement of $N_A$ if and only if
$A \subset B$. Consequently, the lattice of ordered partitions of N
corresponds to the Boolean algebra of the finite set L, denoted by
$\mathcal{B}(L)$, cf. [?, Ch. I.2]. We prefer this notation to that with
partitions, as it is easier to deal with. If $A \subset B$, we will write
$B - A$ for $B \setminus A$, and $\overline{A}$ for the set $L - A$.
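The correspondence between subsets of L and ordered partitions of N can be made concrete with a small sketch. The helper names `links`, `ordered_partition` and `refines` are illustrative assumptions and do not appear in the text; links are stored as exact half-integers via `fractions.Fraction`.

```python
from fractions import Fraction
from math import floor, ceil

def links(n):
    """Links 1/2, 3/2, ..., (2n-1)/2 between the sites 0, ..., n."""
    return [Fraction(2 * i + 1, 2) for i in range(n)]

def ordered_partition(cuts, n):
    """Ordered partition of {0, ..., n} obtained by cutting at the links in `cuts`."""
    blocks, start = [], 0
    for a in sorted(cuts):
        blocks.append(list(range(start, floor(a) + 1)))  # block ends at floor(a)
        start = ceil(a)                                  # next block starts at ceil(a)
    blocks.append(list(range(start, n + 1)))
    return blocks

def refines(fine, coarse):
    """True if every block of `fine` is contained in some block of `coarse`."""
    return all(any(set(b) <= set(c) for c in coarse) for b in fine)

n = 4
A = {Fraction(3, 2)}
B = {Fraction(3, 2), Fraction(7, 2)}
print(ordered_partition(set(), n))     # [[0, 1, 2, 3, 4]]
print(ordered_partition(A, n))         # [[0, 1], [2, 3, 4]]
print(ordered_partition(B, n))         # [[0, 1], [2, 3], [4]]
print(ordered_partition(links(n), n))  # [[0], [1], [2], [3], [4]]
print(refines(ordered_partition(B, n), ordered_partition(A, n)))  # True, since A is a subset of B
```

The empty set of cuts gives the trivial partition and the full set L gives the partition into single sites, in line with the two special cases mentioned above.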

This setup allows us to use the powerful tool of Möbius inversion from
combinatorial theory [?, Ch. IV.2], which is a systematic way to employ the
inclusion-exclusion principle. If f and g are mappings from $\mathcal{B}(L)$
to $\mathbb{R}$ which are, for all $A \subset L$, related by

    $g(A) = \sum_{B \subset A} f(B)$,   (2)

then this can be solved for f via the inversion formula [?, Thm. 4.18]

    $f(A) = \sum_{B \subset A} g(B)\, \mu(B, A)$   (3)

with the Möbius function $\mu(B, A) = (-1)^{|A - B|}$, where $|A - B|$
stands for the cardinality of the set $A - B$. For B not a subset of A, we
set $\mu(B, A) = 0$, which makes the Möbius function into an element of the
so-called incidence algebra, see [?, Ch. IV.1] for details. It is important
to note that Möbius inversion is not restricted to functions, it also
applies to bounded operators.
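As a quick plausibility check of Eqs. (2) and (3), the following sketch sums an arbitrary function f over subsets and then recovers it by Möbius inversion. The function names and the test function f are assumptions made for illustration only; the integers 1, 2, 3 merely stand in for links.

```python
from itertools import combinations

def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    s = sorted(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def subset_sum(f, L):
    """g(A) = sum of f(B) over all B contained in A, as in Eq. (2)."""
    return {A: sum(f[B] for B in subsets(A)) for A in subsets(L)}

def moebius_invert(g, L):
    """f(A) = sum over B contained in A of g(B) * (-1)^|A - B|, as in Eq. (3)."""
    return {A: sum(g[B] * (-1) ** len(A - B) for B in subsets(A))
            for A in subsets(L)}

L = {1, 2, 3}                                          # stand-ins for three links
f = {A: 10 * len(A) + sum(A) + 1 for A in subsets(L)}  # arbitrary test function
g = subset_sum(f, L)
print(moebius_invert(g, L) == f)                       # True
```

Integer arithmetic keeps the comparison exact, so the round trip returns f itself rather than an approximation.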



2 Mutation and Markov generator 

The description of mutation is rather straightforward. Let us start from a
finite population. Since we are working in continuous time, we assume an in- 
dependent Poisson clock for each individual member of a (finite) population, 
and a mutation occurs for an individual whenever its clock rings, according 
to prescribed mutation rates between (finitely many) types or states. Since 
the individuals are independent, this is a simple Markov process for each of 
them. If we now go to the infinite population limit, the time evolution of 
the probability measure for the types is, almost surely, described by a
(deterministic) ordinary differential equation (ODE). This is the so-called
IPL equation, compare [21, Thm. 11.2.1] for a general justification, which
we will
also rely on below. For the simple mutation case, this ODE is linear. It 
clearly coincides with the ODE for the probability measure of the individual 
Markov process, usually obtained from multiple realizations through the law 
of large numbers. 

Let us consider the case that X is a finite state space of cardinality
|X| = M in more detail, where $\dim_{\mathbb{R}}(\mathcal{M}(X)) = M$. The
mutation rate from state i to state k is given by
$Q_{ki} = Q_{k \leftarrow i}$, where we already consider Q as a mapping
acting on the corresponding probabilities, resp. measures. The rate matrix
Q is a Markov generator, i.e. it has non-negative entries everywhere except
on its diagonal, and vanishing column¹ sums. The time evolution is then
fully described by the Markov semigroup $\{\exp(tQ) \mid t \ge 0\}$, see
[21, Ch. 1.1 and Ch. 4.2]. We shall usually assume that Q is irreducible,
i.e. it is possible to reach every state from any other one. In this case,
the equilibrium state is unique and given by the properly normalized
0-eigenvector of the generator Q. It can actually be given in closed form,
see [?, Lemma 6.3.1].

If X has the product structure introduced above, our mutation process 
is supposed to be of a more special form, for biological reasons. We assume 
that mutation happens at all sites in parallel and independently from one 
another, so that our generator has the form 

    $Q = \sum_{i=0}^{n} Q_i$,   (4)

where each $Q_i$ is, in a properly coordinatized way, the tensor product of a
rate matrix at site i and unit matrices of matching dimension everywhere
else, i.e.

    $Q_i = \mathbb{1}_{M_0} \otimes \cdots \otimes \mathbb{1}_{M_{i-1}} \otimes q_i \otimes \mathbb{1}_{M_{i+1}} \otimes \cdots \otimes \mathbb{1}_{M_n}$,   (5)

where $q_i$ is a local rate matrix (of dimension $M_i$) for the state space
$X_i$, acting on $\mathcal{M}(X_i)$. The rate matrices $Q_i$ clearly commute
with one another. Note also that Q of (4) is irreducible if and only if all
the $q_i$ are. The Markov semigroup inherits the tensor product structure,
i.e. we have

    $\exp(tQ) = \prod_{i=0}^{n} \exp(tQ_i) = \bigotimes_{i=0}^{n} \exp(t q_i)$.   (6)
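The tensor product structure of Eqs. (4) and (5) can be checked by hand on a tiny example. The sketch below builds Q for two sites with 2 and 3 states from local rate matrices; the numerical entries of `q0` and `q1` are arbitrary assumptions, chosen only so that columns sum to zero, and the matrix helpers are written out in plain Python to keep the example self-contained.

```python
def identity(m):
    """m x m unit matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def column_sums(a):
    return [sum(col) for col in zip(*a)]

# Local rate matrices (transposed convention: columns sum to zero).
q0 = [[-1, 2], [1, -2]]
q1 = [[-3, 1, 0], [3, -2, 4], [0, 1, -4]]

Q0 = kron(q0, identity(3))   # q0 acts on site 0, identity on site 1
Q1 = kron(identity(2), q1)   # identity on site 0, q1 acts on site 1
Q = add(Q0, Q1)              # Eq. (4) for n = 1

print(column_sums(Q))                    # [0, 0, 0, 0, 0, 0]
print(matmul(Q0, Q1) == matmul(Q1, Q0))  # True
```

The commutativity of `Q0` and `Q1` is what allows the exponential of the sum to factorize as in Eq. (6).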



¹ In contrast to the standard probability literature, we adopt the transposed
version here since we are considering the situation from the (linear) operator
point of view.

In view of our following description of recombination, we prefer to avoid
an explicit coordinatization here, so we will not use matrix notation. This
simply means that we have to reinterpret the generator Q as a linear operator
on $\mathcal{M}(X)$. Nothing of the above actually changes, we only have to
read Q (or $q_i$) as a linear mapping on $\mathcal{M}(X)$ (or on
$\mathcal{M}(X_i)$). The two conditions for Q to be a Markov generator now
read as follows (the analogous conditions apply to $q_i$ in relation to
$\mathcal{M}(X_i)$).

1. If $\nu$ is a positive measure and E any Borel set such that
$\nu(E) = 0$, then $(Q\nu)(E) \ge 0$.

2. If $\nu$ is a positive measure, then $(Q\nu)(X) = 0$.

The first condition ensures that the semigroup generated by Q maps
$\mathcal{M}_+(X)$ into itself. Under the present circumstances, where Q is
bounded and $\mathcal{M}(X)$ is a reflexive Banach space, this condition is
necessary and sufficient for the positivity of exp(tQ), see [?, Thm. 1.11].
It is sometimes also called the positive minimum principle. The second
condition means that the semigroup is Markov, i.e. it preserves the norm of
positive measures, and, in particular, maps $\mathcal{P}(X)$ into itself.
In this setting, irreducibility implies that the kernel of the Markov
generator Q is one-dimensional.

The IPL equation for our simple mutation process² now reads

    $\dot\omega = \Phi_{\mathrm{mut}}(\omega) := \Bigl(\sum_{i=0}^{n} Q_i\Bigr)\,\omega$,   (7)

which we will take, in generalization of the discrete situation, as the
starting point for the analysis of mutation, without tracing it back to an
explicit stochastic process. We then obtain, by employing standard results
[?] from the theory of ordinary linear differential equations in
(finite-dimensional) Banach spaces (see also Theorem 1 below):

Proposition 1 The abstract Cauchy problem of the IPL equation (7) with
initial condition $\omega_0 \in \mathcal{P}(X)$ has the unique solution

    $\omega_t = \exp\Bigl(t \sum_{i=0}^{n} Q_i\Bigr)\,\omega_0$,

which is, for $t \ge 0$, a one-parameter family of probability measures. □

² In this linear case, the IPL equation is closely related to the master
equation commonly used in the physics literature, see [?, Ch. 5] for
details.

To formulate a generalization of Prop. 1, let us forget about the product
structure for a moment and consider the linear ODE

    $\dot\omega = Q\,\omega$

with Q the generator of a uniformly (or norm) continuous Markov semigroup
on $\mathcal{M}(X)$, compare [19, Ch. 1.3]. This is the case if and only if
the linear operator Q, in addition to satisfying assumptions 1. and 2. from
above, is bounded, and hence defined on all of $\mathcal{M}(X)$, see
[19, Cor. II.1.5]. In particular, we can then write the semigroup in
exponential form [?, Thm. 1.3.7], i.e. as exp(tQ), and the solution as
$\omega_t = \exp(tQ)\,\omega_0$. In what follows, we will
(non-constructively) assume that a process is given that leads to a bounded
generator Q which is a linear operator on $\mathcal{M}(X)$, i.e. maps
regular Borel measures to regular Borel measures. As long as this is the
case, it is sufficient to work with assumptions 1. and 2., even if the space
of measures considered is no longer reflexive. The analogue of Prop. 1 then
holds on the Banach subspace $\mathcal{M}^{\otimes}$, to which we shall
restrict our attention whenever Q is of the form specified in Eqs. (4) and
(5). This makes no difference at all as long as X is discrete.

Many results can still be generalized to densely defined generators of
strongly continuous semigroups, see [?, Ch. 1.5], but already the
well-posedness of the Cauchy problem needs some thought, compare
[?, Ch. II.6] for a discussion. Also, the characterization of generators for
positive semigroups becomes more involved, see [?, Ch. 3]. Usually, one
would then rather describe the entire process by means of semigroups on
function spaces, compare [?, Ch. 1.4]. Since all explicit mutation schemes
we have in mind lead to uniformly continuous semigroups, we will not expand
on the more general situation.

Let us instead add a few remarks on the general type of IPL equation
that arises when recombination and selection are also included. This will
also better explain our formulation of mutation, from the point of view of
measure-valued differential equations. In what follows, it is sufficient to
investigate the first order ODE

    $\dot\omega = \Phi(\omega)$   (8)

on the Banach space $\mathcal{M}(X)$, where $\Phi$ is a mapping from
$\mathcal{M}(X)$ into itself (alternatively, we can study (8) on any closed
subspace of $\mathcal{M}(X)$ that is invariant under $\Phi$). Unlike
$\Phi_{\mathrm{mut}}$ from (7), $\Phi$ need not be linear, and it is the
nonlinear cases below that we are most interested in. The three properties
we will meet below are:

A1 The mapping $\Phi$ is (globally) Lipschitz.

A2 If $\nu \in \mathcal{M}_+(X)$, i.e. $\nu$ is a positive measure, and E
any Borel set such that $\nu(E) = 0$, then we have $(\Phi(\nu))(E) \ge 0$.

A3 For any $\nu \in \mathcal{M}_+(X)$, we have $(\Phi(\nu))(X) = 0$.

It is clear that our formulation of mutation constitutes a linear example of
such a mapping.

Theorem 1 If $\Phi : \mathcal{M}(X) \to \mathcal{M}(X)$ satisfies (A1), the
abstract Cauchy problem of the ODE (8), with initial condition
$\omega_0 \in \mathcal{M}(X)$, has a unique solution. If $\Phi$ also
satisfies (A2), the cone $\mathcal{M}_+(X)$ of positive measures is
invariant under the semiflow for $t \ge 0$ (in other words,
$\mathcal{M}_+(X)$ is positive invariant). Finally, if $\Phi$ also
satisfies (A3), the norm of positive measures is preserved in forward time.
In particular, the convex set $\mathcal{P}(X)$ of probability measures is
then positive invariant.

Proof: If $\Phi$ is Lipschitz, we can invoke the Picard-Lindelöf Theorem for
ODEs on Banach spaces, see [?, Thm. 7.6], so existence and uniqueness of
the solution of the abstract Cauchy problem are clear.

If $\Phi$ also satisfies (A2), positive invariance of $\mathcal{M}_+(X)$
follows from a continuity argument, see p. 235 and Thm. 16.5 together with
Remark 16.6 of [?] for a proof. If $\Phi$ is linear, (A2) is the so-called
positive minimum principle, and our assertion also follows from
[?, Thm. 1.11], which uses a functional analytic proof.

Finally, assume $\Phi$ satisfies (A1) - (A3). Let
$\omega_0 \in \mathcal{M}_+^m(X)$ be the initial condition and denote the
corresponding unique solution of (8) by $\omega_t$. Then,
$\omega_t \in \mathcal{M}_+(X)$ for all $t \ge 0$ by the previous argument,
so $\|\omega_t\| = \omega_t(X)$. This implies
$\frac{d}{dt}\|\omega_t\| = (\Phi(\omega_t))(X) = 0$ by assumption (A3), so
$\|\omega_t\| \equiv \|\omega_0\| = m$. This proves the assertion. □


3 Recombination

This section deals with the nonlinear IPL equation for recombination, and
is the core of our article. We develop the results step by step here. The
combination with mutation will then be rather painless, and an addition of
selection will be discussed after that.

3.1 Recombination on measures 

Let X, Y be two locally compact spaces with attached measure spaces
$\mathcal{M}(X)$ and $\mathcal{M}(Y)$. If $f : X \to Y$ is a continuous
function and $\omega \in \mathcal{M}(X)$, then
$f.\omega := \omega \circ f^{-1}$ is an element of $\mathcal{M}(Y)$, where
$f^{-1}(y) := \{x \in X \mid f(x) = y\}$ means the preimage of $y \in Y$ in
X, with obvious extension to $f^{-1}(B)$, the preimage of a subset
$B \subset Y$ in X. Due to the continuity of f, $f^{-1}(B)$ is a Borel set
in X if B is a Borel set in Y.

Let $X = X_0 \times \cdots \times X_n$ be as in Section 1 and let, from now
on, N and L always denote the set of sites and links as introduced there.
In this section, we can entirely work with the Banach space
$\mathcal{M}(X)$, equipped with the variation norm $\|.\|$. Let
$\pi_i : X \to X_i$ be the canonical projection, which is continuous. It
induces a mapping from $\mathcal{M}(X)$ to $\mathcal{M}(X_i)$ by
$\omega \mapsto \pi_i.\omega$, where
$(\pi_i.\omega)(E) = \omega(\pi_i^{-1}(E))$ for any Borel set
$E \subset X_i$. By (slight) abuse of notation, we will use the symbol
$\pi_i$ also for this induced mapping. It is clear that $\pi_i$ is linear
and maps positive measures to positive measures of the same norm. As such,
it is bounded and hence also continuous. In particular, it maps
$\mathcal{P}(X)$ to $\mathcal{P}(X_i)$ and may then be understood as
marginalization. Likewise, we can start from any (ordered) index set
$I \subset N$ and define a projector
$\pi_I : \mathcal{M}(X) \to \mathcal{M}(X_I)$ with
$X_I := \mathop{\times}_{i \in I} X_i$. With this notation, $X_N = X$. We
will frequently also use the abbreviation $\pi_{<\alpha}$ for the projector
$\pi_{\{0, \ldots, \lfloor\alpha\rfloor\}}$, and $\pi_{>\alpha}$ for
$\pi_{\{\lceil\alpha\rceil, \ldots, n\}}$. These objects may be understood
as 'cut and forget' operators, since they give the distribution of what is
left after a cut is made at $\alpha$, and the trailing resp. leading segment
is discarded.

This now enables us to introduce the elementary recombination operator,
or recombinator as we will call it from now on,
$R_\alpha : \mathcal{M}(X) \to \mathcal{M}(X)$, for $\alpha \in L$. If
$\omega = 0$, $R_\alpha(\omega) := 0$, and otherwise

    $R_\alpha(\omega) := \frac{1}{\|\omega\|}\,\bigl((\pi_{<\alpha}.\omega) \otimes (\pi_{>\alpha}.\omega)\bigr)$,   (9)

which is a (partial) product measure. Here and in what follows, we tacitly
identify (if necessary) a product measure with its unique extension to a
regular Borel measure on X, which is justified by Fact 1. The following
property is now an immediate consequence of the definition.

Fact 2 The recombinator $R_\alpha$ maps $\mathcal{M}_+(X)$ into itself and
preserves the norm of positive measures. In particular, it maps
$\mathcal{P}(X)$ into itself. □

Let us comment on the choice of (9). Being composed of the cut-and-forget
operators for the leading and the trailing ends, $R_\alpha(\omega)$ has the
interpretation of a 'cut-and-relink operator', which describes a cut at
$\alpha$, followed by (random) reunion of the resulting segments.

At first sight, it might appear more natural to drop the prefactor
$1/\|\omega\|$. However, the norm of a positive measure $\omega$ would then
not be preserved unless $\|\omega\| = 1$. In view of later extensions, it is
more desirable not to be restricted to probability measures, and that is why
we prefer (9), which makes $R_\alpha$ positive homogeneous of degree 1,

    $R_\alpha(a\omega) = |a|\, R_\alpha(\omega)$,   (10)

for arbitrary $a \in \mathbb{R}$. Note, however, that $R_\alpha$ is not a
linear operator, not even when restricted to $\mathcal{M}_+(X)$.
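For finite X, the recombinator of Eq. (9) is easy to write down explicitly. The following sketch represents a measure as a dictionary from types (tuples) to exact weights; this representation, the helper names, and the convention of labelling the link $\alpha$ by the integer `cut` $= \lceil\alpha\rceil$ are assumptions made for illustration. It checks norm preservation, the homogeneity (10), and the failure of additivity on one example.

```python
from fractions import Fraction
from itertools import product

def norm(w):
    """Total variation norm of a (signed) measure on a finite set."""
    return sum(abs(v) for v in w.values())

def marginal(w, sites):
    """Projection pi_I.w: sum the weights over all sites outside `sites`."""
    sites = list(sites)
    out = {}
    for x, v in w.items():
        key = tuple(x[i] for i in sites)
        out[key] = out.get(key, 0) + v
    return out

def recombinator(w, cut):
    """R_alpha(w) of Eq. (9); alpha is the link between sites cut-1 and cut.
    Assumes w is defined on the full product space (one key per type)."""
    nrm = norm(w)
    if nrm == 0:
        return {x: Fraction(0) for x in w}
    n_sites = len(next(iter(w)))
    left = marginal(w, range(cut))
    right = marginal(w, range(cut, n_sites))
    return {x: left[x[:cut]] * right[x[cut:]] / nrm for x in w}

def scale(w, a):
    return {x: a * v for x, v in w.items()}

def add(w1, w2):
    return {x: w1[x] + w2[x] for x in w1}

def delta(x, space):
    """Point measure at the type x."""
    return {y: Fraction(int(y == x)) for y in space}

space = list(product((0, 1), repeat=3))          # three sites, alphabet {0, 1}
w1, w2 = delta((0, 0, 0), space), delta((1, 1, 1), space)
w = scale(add(w1, w2), Fraction(1, 2))           # probability measure

Rw = recombinator(w, 1)
print(norm(Rw) == norm(w))                                   # True (Fact 2)
print(recombinator(scale(w, -3), 1) == scale(Rw, 3))         # True (Eq. (10))
print(recombinator(add(w1, w2), 1)
      == add(recombinator(w1, 1), recombinator(w2, 1)))      # False (not linear)
```

Exact rational weights are used so that the equalities can be tested without rounding tolerances.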

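Fact 3 Let $\alpha \in L$. The recombinator $R_\alpha$ satisfies
$\|R_\alpha(\omega)\| \le \|\omega\|$, for all
$\omega \in \mathcal{M}(X)$, and is (globally) Lipschitz on
$\mathcal{M}(X)$.

Proof: Let us first observe that, for arbitrary
$\omega, \omega' \in \mathcal{M}(X)$ and $\alpha \in L$, we obtain the
inequality

    $\|(\pi_{<\alpha}.\omega) \otimes (\pi_{>\alpha}.\omega')\| \le \|\omega\|\,\|\omega'\|$,

which is a simple consequence of Hahn's decomposition for real measures,
see [?, Thm. 6.14], applied separately to the factors of the product
measure. For $0 \ne \omega \in \mathcal{M}(X)$, we then have

    $\|R_\alpha(\omega)\| = \frac{1}{\|\omega\|}\,\|(\pi_{<\alpha}.\omega) \otimes (\pi_{>\alpha}.\omega)\| \le \|\omega\|$,

with equality for positive measures, as stated in Fact 2. Clearly, we also
have $R_\alpha(0) = 0$, so that the first assertion follows.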

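Let $\omega, \omega' \in \mathcal{M}(X)$. If one of them is the 0-measure,
say $\omega' = 0$, we have
$\|R_\alpha(\omega) - R_\alpha(\omega')\| = \|R_\alpha(\omega)\| \le \|\omega\| = \|\omega - \omega'\|$.
So we may assume both $\omega$ and $\omega'$ to be different from 0 and hence
to have positive norm. With the above inequalities, we can now employ the
following 3ε-type argument

    $\|R_\alpha(\omega) - R_\alpha(\omega')\|$
      $\le \frac{1}{\|\omega\|}\,\|(\pi_{<\alpha}.\omega) \otimes (\pi_{>\alpha}.(\omega - \omega'))\| + \frac{1}{\|\omega'\|}\,\|(\pi_{<\alpha}.(\omega - \omega')) \otimes (\pi_{>\alpha}.\omega')\|$
        $+ \Bigl|\frac{1}{\|\omega\|} - \frac{1}{\|\omega'\|}\Bigr|\,\|(\pi_{<\alpha}.\omega) \otimes (\pi_{>\alpha}.\omega')\|$
      $\le 2\,\|\omega - \omega'\| + \bigl|\,\|\omega\| - \|\omega'\|\,\bigr| \le 3\,\|\omega - \omega'\|$.

Together, this gives the second assertion, with Lipschitz constant ≤ 3. □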

In view of Fact 2, it makes sense to investigate the properties of the
recombinators restricted to the positive cone $\mathcal{M}_+(X)$. The
crucial property which underlies our later analysis is the following.

Proposition 2 The elementary recombinators, when restricted to
$\mathcal{M}_+(X)$, are idempotents and commute with one another. In other
words, we then have $R_\alpha^2 = R_\alpha$ and
$R_\alpha R_\beta = R_\beta R_\alpha$ for arbitrary $\alpha, \beta \in L$.

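Proof: The statement is trivial for the action on $\nu = 0$. So, let
$\nu > 0$ be a (strictly) positive measure. We then have
$\nu(X) = \|\nu\|$ and obtain

    $\pi_{<\alpha}.(R_\alpha(\nu)) = \pi_{<\alpha}.\nu$  and  $\pi_{>\alpha}.(R_\alpha(\nu)) = \pi_{>\alpha}.\nu$

in $\mathcal{M}_+(X_{<\alpha})$ resp. $\mathcal{M}_+(X_{>\alpha})$, where we
adopt the same index convention for sets as we did for projectors. Using
$\|R_\alpha(\nu)\| = \|\nu\|$ from Fact 2 and the linearity of the mappings
$\nu \mapsto (\pi.\nu)$, one can now apply the definition of the elementary
recombinators to check explicitly that

    $R_\alpha(R_\alpha(\nu)) = \frac{1}{\|\nu\|}\,\bigl((\pi_{<\alpha}.\nu) \otimes (\pi_{>\alpha}.\nu)\bigr) = R_\alpha(\nu)$.

For commutativity, we may again assume $\nu > 0$ and also $\alpha < \beta$.
Then

    $\pi_{<\alpha}.\bigl((\pi_{<\beta}.\nu) \otimes (\pi_{>\beta}.\nu)\bigr) = \|\nu\|\,(\pi_{<\alpha}.\nu)$,
    $\pi_{>\alpha}.\bigl((\pi_{<\beta}.\nu) \otimes (\pi_{>\beta}.\nu)\bigr) = (\pi_{\{\lceil\alpha\rceil, \ldots, \lfloor\beta\rfloor\}}.\nu) \otimes (\pi_{>\beta}.\nu)$.

The first equation can be verified directly, as in the previous case. The
second can easily be checked on Borel sets of the product form
$E = E_{\{\lceil\alpha\rceil, \ldots, \lfloor\beta\rfloor\}} \times E_{>\beta}$,
followed by an application of Fact 1. Combining these intermediate results,
one obtains

    $R_\alpha R_\beta(\nu) = \frac{1}{\|\nu\|^2}\,(\pi_{<\alpha}.\nu) \otimes (\pi_{\{\lceil\alpha\rceil, \ldots, \lfloor\beta\rfloor\}}.\nu) \otimes (\pi_{>\beta}.\nu) = R_\beta R_\alpha(\nu)$,

which proves our assertion. □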



Remark: In view of positive homogeneity of the recombinators, see Eq. (10),
it would have been sufficient to prove our assertions on $\mathcal{P}(X)$.
The above version, however, shows quite clearly where, and how many,
normalization factors $\|\nu\|$ appear in the tensor products. If we
restrict ourselves to probability measures below, one should keep this in
mind for extending arguments to the full cone, $\mathcal{M}_+(X)$.

A close inspection of the proof of Proposition 2 shows that we have
simultaneously proved the following useful property.

Lemma 1 Let $\nu \in \mathcal{P}(X)$ and $\alpha \in L$. For all
$\beta \in L$ with $\beta \ge \alpha$, we have
$\pi_{<\alpha}.(R_\beta(\nu)) = \pi_{<\alpha}.\nu$. Similarly,
$\pi_{>\alpha}.(R_\beta(\nu)) = \pi_{>\alpha}.\nu$, for all
$\beta \le \alpha$. □
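Proposition 2 and Lemma 1 can be verified numerically on a small finite example. The sketch below uses four binary sites and an arbitrary strictly positive probability measure; the weights, the helper names, and the labelling of the link $\alpha$ by the integer `cut` $= \lceil\alpha\rceil$ are illustrative assumptions. Exact rational arithmetic makes the equalities testable without tolerances.

```python
from fractions import Fraction
from itertools import product

def marginal(w, sites):
    """Projection pi_I.w onto the sites in `sites`."""
    sites = list(sites)
    out = {}
    for x, v in w.items():
        key = tuple(x[i] for i in sites)
        out[key] = out.get(key, 0) + v
    return out

def recombinator(w, cut):
    """R_alpha(w) for a positive measure w; alpha links sites cut-1 and cut."""
    nrm = sum(w.values())
    n_sites = len(next(iter(w)))
    left = marginal(w, range(cut))
    right = marginal(w, range(cut, n_sites))
    return {x: left[x[:cut]] * right[x[cut:]] / nrm for x in w}

n_sites = 4
space = list(product((0, 1), repeat=n_sites))
raw = {x: Fraction(1 + (7 * k * k + 3 * k) % 11) for k, x in enumerate(space)}
total = sum(raw.values())
nu = {x: v / total for x, v in raw.items()}      # arbitrary probability measure
cuts = [1, 2, 3]                                 # the three links

idempotent = all(recombinator(recombinator(nu, a), a) == recombinator(nu, a)
                 for a in cuts)
commute = all(recombinator(recombinator(nu, b), a)
              == recombinator(recombinator(nu, a), b)
              for a in cuts for b in cuts)
# Lemma 1: a cut at or to the right of alpha leaves the leading marginal
# unchanged, a cut at or to the left of alpha leaves the trailing one unchanged.
lemma_leading = all(marginal(recombinator(nu, b), range(a)) == marginal(nu, range(a))
                    for a in cuts for b in cuts if b >= a)
lemma_trailing = all(marginal(recombinator(nu, b), range(a, n_sites))
                     == marginal(nu, range(a, n_sites))
                     for a in cuts for b in cuts if b <= a)
print(idempotent, commute, lemma_leading, lemma_trailing)   # True True True True
```

A check on one measure is of course no proof, but it is a convenient guard against index errors when implementing the recombinators.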



3.2 The IPL equation and its solution 

Let us start with a brief description of the recombination process for finite
X, and a population of m individuals, each of the form
$x = (x_0, x_1, \ldots, x_n)$ with $x_i \in X_i$. Every individual carries a
Poisson clock at each link $\alpha \in L$, with parameters
$\varrho_\alpha > 0$, which do not depend on the individual. If the clock at
link $\alpha$ of the individual x rings, a random partner y is picked from
the population for recombination at that link. The recombined pair is then
$(x_0, \ldots, x_{\lfloor\alpha\rfloor}, y_{\lceil\alpha\rceil}, \ldots, y_n)$
and
$(y_0, \ldots, y_{\lfloor\alpha\rfloor}, x_{\lceil\alpha\rceil}, \ldots, x_n)$.

To describe the entire population, let $Z_x(t)$ be the random variable that
gives the number of x-individuals at time t, and Z(t) the combined random
vector with components $Z_x(t)$. Hence, if Z(t) = z, and $x \ne y$, we can
have transitions from z to
$z - u_x - u_y + u_{(x_{<\alpha}, y_{>\alpha})} + u_{(y_{<\alpha}, x_{>\alpha})}$,
where we use our short hand notation for indices, and $u_x$ to denote the
unit vector corresponding to x. Such a transition occurs at rate
$\varrho_\alpha z_x z_y/(m - z_x)$.
Note that this process implies instant mixing of all (geno-)types in the
population. This is an idealization which neglects that maternal and paternal 
genes stay together for the lifetime of an individual. Nevertheless, this is a 
good and realistic model if recombination events are rare on the time scale of 
the individual life span. This is certainly true if our sites belong to the DNA 
sequence of a single gene, or a few adjacent genes. It is then well justified to 
describe recombination in terms of these first order effects only. 

Let us look at the influence of increasing $m$, whence we write $Z^{(m)}(t)$ to indicate dependence on system size. As $m \to \infty$, the sequence of random processes $Z^{(m)}(t)/m$ converges almost surely to the solution of a differential equation with initial condition $Z^{(m)}(0)/m$ (resp. its limit as $m \to \infty$), see
2l| , Thm. 11.2.1]. The corresponding IPL equation P, Eq. 2.5], reformulated 



in our measure-theoretic setting, reads

\[ \dot\omega = \Phi_{\rm rec}(\omega) := \sum_{\alpha\in L} \varrho_\alpha\,(R_\alpha - 1)(\omega) . \tag{11} \]

In line with our strategy for the mutation processes, we take this nonlinear ODE as the general starting point for the recombination analysis on product spaces $X$ built from arbitrary locally compact spaces $X_i$. We will assume that $\varrho_\alpha > 0$, for all $\alpha \in L$, without loss of generality (if $\varrho_\alpha = 0$, remove the link at $\alpha$, absorb the pair $(\lfloor\alpha\rfloor, \lceil\alpha\rceil)$ into a single site, and identify $X_{\lfloor\alpha\rfloor} \times X_{\lceil\alpha\rceil}$ with the state space at that site, thus reducing the number of sites (and links) by one).

Proposition 3 The abstract Cauchy problem of the IPL equation (11) has a unique solution. Furthermore, $\mathcal{M}_+(X)$ is positive invariant under the flow, with the norm of positive measures preserved. In particular, $\mathcal{P}(X)$ is positive invariant.

Proof: Consider $\dot\omega = \Phi_{\rm rec}(\omega)$, which is a special case of the general ODE treated before, so we want to apply Theorem 1. Since each $R_\alpha$ is Lipschitz (see the corresponding Fact above), $\Phi_{\rm rec}$ is Lipschitz, so assumption (A1) is satisfied.

Let $\nu \in \mathcal{M}_+(X)$, i.e. $\nu(E) \ge 0$ for all Borel sets $E \subset X$. Let $E$ be any Borel subset of $X$ such that $\nu(E) = 0$. Then

\[ \Phi_{\rm rec}(\nu)(E) = \sum_{\alpha\in L} \varrho_\alpha\, R_\alpha(\nu)(E) \ge 0 \]

because each $R_\alpha(\nu)$ is a positive measure and all $\varrho_\alpha > 0$ by assumption, so (A2) is satisfied.

Finally, since the recombinators preserve the norm of positive measures, it is easy to check that $\Phi_{\rm rec}(\nu)(X) = 0$ for any positive measure $\nu$, which shows that assumption (A3) is satisfied, too. Theorem 1 then establishes our claims. □

The difficulty in solving (11) stems from the nonlinearity of the right-hand side, so $\Phi_{\rm rec}$ cannot be considered as the generator of an exponential semigroup. It is, however, rather natural to expect that the solution should still have a rather similar structure, as the $R_\alpha$ are at least positive homogeneous of degree one and commute with one another. Let us therefore, for any $G \subset L$, introduce the composite recombinators

\[ R_G := \prod_{\alpha\in G} R_\alpha . \tag{12} \]

They are well-defined on $\mathcal{M}_+(X)$ due to Proposition 2, while an order of the product has to be specified otherwise. In any case, $\|R_G(\omega)\| \le \|\omega\|$ for all $\omega \in \mathcal{M}(X)$. Note that $R_\varnothing = 1$ and $R_{\{\alpha\}} = R_\alpha$ in this notation. The composite recombinators are again positive homogeneous of degree one. A simple induction argument based on Proposition 2 gives the following result.

Corollary 1 On $\mathcal{M}_+(X)$, the composite recombinators satisfy

\[ R_G R_H = R_{G\cup H} , \]

for arbitrary $G, H \subset L$. Furthermore, each $R_G$ maps $\mathcal{M}_+(X)$ into itself and preserves the norm of positive measures. □

Let us pretend for a moment that the idempotents $R_\alpha$ were actually linear operators. In such a case, we would get

\[ \exp\bigl(\varrho_\alpha t\,(R_\alpha - 1)\bigr) = \exp(-\varrho_\alpha t)\,1 + \bigl(1 - \exp(-\varrho_\alpha t)\bigr) R_\alpha . \]

Taking the product over such terms for all $\alpha \in L$ and expanding it would formally lead to the sum

\[ \sum_{G\subset L} a_G(t)\, R_G \]

with the coefficient functions

\[ a_G(t) = \Bigl(\prod_{\alpha\in\bar G} \exp(-\varrho_\alpha t)\Bigr) \cdot \Bigl(\prod_{\beta\in G} \bigl(1 - \exp(-\varrho_\beta t)\bigr)\Bigr) , \tag{13} \]

where $\bar G = L \setminus G$ denotes the complement of $G$ in $L$.

It will have a touch of magic below when we prove that this little "derivation" actually gives the correct answer! After we have established our main result in Theorem 2, we will come back to these coefficients and give them a probabilistic interpretation. This will also motivate why they are a very reasonable guess to start with.
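As a quick numerical illustration of (13), the following sketch evaluates the coefficient functions for all subsets of a small link set. The rates, the time point and the helper names (`subsets`, `a_coeff`) are assumptions chosen for illustration only.

```python
from itertools import combinations
from math import exp, isclose, prod


def subsets(items):
    """All subsets of `items`, returned as frozensets."""
    items = tuple(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def a_coeff(G, rates, t):
    """a_G(t): factor exp(-rho t) for links outside G, 1 - exp(-rho t) inside G."""
    return prod((1.0 - exp(-rho * t)) if link in G else exp(-rho * t)
                for link, rho in rates.items())


rates = {0.5: 0.3, 1.5: 1.0, 2.5: 0.1}  # hypothetical crossover rates
t = 2.0
coeffs = {G: a_coeff(G, rates, t) for G in subsets(rates)}
total = sum(coeffs.values())
print(total)
```

The coefficients are non-negative and add up to one (up to rounding), which is what makes the formal sum a convex combination.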

As mentioned before, the elementary recombinators are not linear. Nevertheless, they have a related property on convex combinations. If $\omega = \sum_{i=1}^{k} a_i \nu_i$ is a convex linear combination of positive measures $\nu_i$ of equal norm, we get

\[ R_\alpha(\omega) = \sum_{i=1}^{k} a_i\, R_\alpha(\nu_i) + \text{remainder} \tag{14} \]

where one can show, by a rather straight-forward calculation which we omit 
here, that the remainder is given by 



This shows that the recombinators are indeed inherently nonlinear, but also 
that they might act like linear operators on special convex combinations, 
namely those for which the remainder vanishes. This is precisely what we 
need to solve our problem. 
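The calculation behind the remainder in (14) can be sketched as follows. This is a reconstruction from the definition $R_\alpha(\nu) = \frac{1}{\|\nu\|}(\pi_{<\alpha}.\nu)\otimes(\pi_{>\alpha}.\nu)$ under the stated assumptions $\|\nu_i\| = \|\omega\|$ and $\sum_i a_i = 1$; it is not a quotation of the original formula, and the way the remainder is written here is one of several equivalent forms.

```latex
\begin{align*}
R_\alpha(\omega) - \sum_{i} a_i\, R_\alpha(\nu_i)
  &= \frac{1}{\|\omega\|} \sum_{i,j} a_i a_j\,
     (\pi_{<\alpha}.\nu_i) \otimes \bigl(\pi_{>\alpha}.(\nu_j - \nu_i)\bigr) \\
  &= -\frac{1}{2\,\|\omega\|} \sum_{i,j} a_i a_j\,
     \bigl(\pi_{<\alpha}.(\nu_i - \nu_j)\bigr) \otimes
     \bigl(\pi_{>\alpha}.(\nu_i - \nu_j)\bigr)
\end{align*}
```

The second line follows by symmetrizing in $i$ and $j$. In this form, the remainder vanishes, for instance, whenever all $\nu_i$ share the same marginal to the left (or to the right) of the link $\alpha$.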

Proposition 4 Let $\nu$ be a positive measure, $\alpha \in L$ and $a_G(t)$ the coefficient functions of (13). Then, for any fixed $t > 0$, we have

\[ R_\alpha\Bigl(\sum_{G\subset L} a_G(t)\, R_G(\nu)\Bigr) = \sum_{G\subset L} a_G(t)\, R_\alpha\bigl(R_G(\nu)\bigr) . \]

Before we prove this result, we formulate a special property of the coefficient functions first. Observe that, for fixed $t > 0$, $q_\alpha := \exp(-\varrho_\alpha t)$ is a number between 0 and 1. It can be interpreted as a probability (namely that link $\alpha$ has not been hit until time $t$). With this, the coefficients read

\[ a_G^L = \prod_{\alpha\in L\setminus G} q_\alpha \cdot \prod_{\beta\in G} (1 - q_\beta) \tag{15} \]

where we have suppressed the (fixed) time, but added the set of links, $L$, as an upper index. We can now formulate a crucial factorization property.

Lemma 2 Let $L = L_1 \,\dot\cup\, L_2$ be a partition of $L$, and set $G_i = G \cap L_i$ for an arbitrary $G \subset L$. Then, the coefficients of (15) satisfy $a_G^L = a_{G_1}^{L_1} \cdot a_{G_2}^{L_2}$. Furthermore, for any $L' \subset L$, we have

\[ \sum_{H\subset L'} a_H^{L'} = 1 . \]

Proof: Since $L_1 \cap L_2 = \varnothing$, the first statement is a direct consequence of the product form of $a_G$ in Eq. (13). The normalization property can be verified from the probabilistic interpretation mentioned above. If $1 - q_\alpha$ (resp. $q_\alpha$) is the probability that link $\alpha$ has (resp. has not) been hit, $a_H^{L'}$ is the probability that, of the links in $L'$, precisely those in $L' \setminus H$ are spared. Consequently, $\sum_{H\subset L'} a_H^{L'}$ is the sum over the probabilities of all possible events, hence equal to 1. Alternatively, this identity can be derived from a simple Möbius inversion argument, as we show below in Fact 5. □

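Proof of Proposition 4: Since the recombinators are positive homogeneous of degree one, it suffices to prove the statement for $\nu$ a probability measure. Let $\alpha \in L$ be fixed.

Set $\omega = \sum_{G\subset L} a_G\, R_G(\nu)$. Since $\nu \in \mathcal{P}(X)$ implies $\omega \in \mathcal{P}(X)$, we obtain

\begin{align*}
R_\alpha(\omega) &= \Bigl(\sum_{G\subset L} a_G\, \pi_{<\alpha}.R_G(\nu)\Bigr) \otimes \Bigl(\sum_{H\subset L} a_H\, \pi_{>\alpha}.R_H(\nu)\Bigr) \\
 &= \sum_{G,H\subset L} a_G\, a_H\, \bigl(\pi_{<\alpha}.R_G(\nu)\bigr) \otimes \bigl(\pi_{>\alpha}.R_H(\nu)\bigr)
\end{align*}

where we have used the linearity of the mappings $\pi_{<\alpha}$ and $\pi_{>\alpha}$.

Let us define $L_1 = \{\tfrac12, \tfrac32, \dots, \alpha\}$ and $L_2 = L - L_1$, so that $L = L_1 \,\dot\cup\, L_2$ is a partition of $L$. Also, let $G_i = G \cap L_i$ and $H_i = H \cap L_i$, for $G, H \subset L$. Lemma 1 then tells us that

\[ \pi_{<\alpha}.R_G(\nu) = \pi_{<\alpha}.R_{G_1}(\nu) \quad\text{and}\quad \pi_{>\alpha}.R_H(\nu) = \pi_{>\alpha}.R_{H_2}(\nu) . \]

Inserting this into the previous equation and invoking Lemma 2 repeatedly gives

\begin{align*}
R_\alpha(\omega) &= \sum_{G,H\subset L} a_G^L\, a_H^L\, \bigl(\pi_{<\alpha}.R_{G_1}(\nu)\bigr) \otimes \bigl(\pi_{>\alpha}.R_{H_2}(\nu)\bigr) \\
 &= \sum_{G_1\subset L_1} \sum_{G_2\subset L_2} \sum_{H_1\subset L_1} \sum_{H_2\subset L_2} a_{G_1}^{L_1} a_{G_2}^{L_2} a_{H_1}^{L_1} a_{H_2}^{L_2}\; R_\alpha\bigl(R_{G_1\cup H_2}(\nu)\bigr) \\
 &= \sum_{G_1\subset L_1} \sum_{H_2\subset L_2} a_{G_1}^{L_1} a_{H_2}^{L_2}\; R_\alpha\bigl(R_{G_1\cup H_2}(\nu)\bigr) \\
 &= \sum_{K\subset L} a_K^L\; R_\alpha\bigl(R_K(\nu)\bigr) ,
\end{align*}

which proves our assertion. □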
Remark: Proposition 4 admits the following interpretation. Let $\nu$ be a positive measure, with $\|\nu\| = m > 0$. Then, the $2^{|L|}$ measures $R_G(\nu)$ with $G \subset L$ form the vertices of a $\|.\|$-closed simplex in $\mathcal{M}_+(X)$. On some of their convex combinations (in particular along solutions, as we will see shortly), the elementary recombinators act linearly. It is this simplex, foliated into solution curves, to which the entire time evolution is constrained, with $\nu$ as the initial condition.

The positive measure $\nu$ in Proposition 4 was arbitrary. This means that, when restricting the action of the $R_\alpha$'s to $\mathcal{M}_+(X)$, we can formulate the rule on the level of operators. Observe that $R_\alpha(R_K(\nu)) = R_{K\cup\{\alpha\}}(\nu) = R_K(\nu')$ where $\nu' = R_\alpha(\nu)$. By a simple induction argument, we thus arrive at

Corollary 2 Let $a_G(t)$ be the coefficient function of (13), and let $t > 0$ be fixed. On $\mathcal{M}_+(X)$, the recombinators satisfy the equation

\[ R_H\Bigl(\sum_{G\subset L} a_G(t)\, R_G\Bigr) = \sum_{G\subset L} a_G(t)\, R_{G\cup H} \]

for arbitrary $H \subset L$. □

We now assume that the initial condition, $\omega_0$, is a positive measure and make the following ansatz for the solution of (11):

\[ \omega_t = \sum_{G\subset L} a_G(t)\, R_G(\omega_0) \tag{16} \]

with the coefficient functions $a_G(t)$ of (13). Note that they do not depend on $\omega_0$. The initial values are $a_\varnothing(0) = 1$ and $a_G(0) = 0$ for all $\varnothing \ne G \subset L$. By Corollary 1, each $R_G(\omega_0)$ is a positive measure with the same norm as $\omega_0$. This implies that, as long as $a_G(t) \ge 0$ for $t \ge 0$, the ansatz for $\omega_t$ must form a convex linear combination of positive measures of equal norm if it is
a solution of (|^). This follows from Eq. (^) together with Lemma ^, or from 
Fact H below. 

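The time derivative of $\omega_t$ of (16) is $\dot\omega_t = \sum_{G\subset L} \dot a_G(t)\, R_G(\omega_0)$. On the other hand, Proposition 4 means that the $R_\alpha$ act linearly on the convex combination (16), and we obtain

\begin{align*}
\Phi_{\rm rec}(\omega_t) &= \sum_{\alpha\in L} \varrho_\alpha\, (R_\alpha - 1)\Bigl(\sum_{G\subset L} a_G(t)\, R_G(\omega_0)\Bigr) \\
 &= \sum_{\alpha\in L} \sum_{G\subset L} \varrho_\alpha\, a_G(t)\, \bigl(R_{G\cup\{\alpha\}}(\omega_0) - R_G(\omega_0)\bigr) \\
 &= \sum_{\alpha\in L} \varrho_\alpha \Bigl(\sum_{\alpha\in\dot G\subset L} a_{\dot G\setminus\{\alpha\}}(t)\, R_{\dot G}(\omega_0) - \sum_{\alpha\notin\dot G\subset L} a_{\dot G}(t)\, R_{\dot G}(\omega_0)\Bigr) \\
 &= \sum_{G\subset L} \Bigl[\sum_{\alpha\in G} \varrho_\alpha\, a_{G\setminus\{\alpha\}}(t) - \sum_{\beta\in\bar G} \varrho_\beta\, a_G(t)\Bigr] R_G(\omega_0) ,
\end{align*}

where we use the notation $\dot G$ in the third step to indicate the summation variable. It is now a straight-forward calculation to check that the coefficients $a_G(t)$ of (13) indeed satisfy the equations

\[ \dot a_G(t) = \sum_{\alpha\in G} \varrho_\alpha\, a_{G\setminus\{\alpha\}}(t) - \sum_{\beta\in\bar G} \varrho_\beta\, a_G(t) \]

and that they constitute a convex combination in (16). Consequently, our ansatz solves the IPL equation (11), and, by Proposition 3, this is the unique solution we are after. We have thus established the following main result.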

Theorem 2 The ansatz (16) solves the IPL equation (11) with initial condition $\omega_0 \in \mathcal{M}_+(X)$ if and only if the coefficient functions are given by (13), i.e. by

\[ a_G(t) = \exp\Bigl(-\sum_{\alpha\in\bar G} \varrho_\alpha t\Bigr) \cdot \prod_{\beta\in G} \bigl(1 - \exp(-\varrho_\beta t)\bigr) \]

for all $G \subset L$. □

Remark: To interpret the coefficient $a_G(t)$, let us consider a single individual. Since $\exp(-\varrho_\alpha t)$ is the probability that link $\alpha$ has experienced no crossover event until time $t$ (recall that we have assumed a Poisson process of rate $\varrho_\alpha$ at link $\alpha$), $a_G(t)$ may be interpreted as the probability that the set of all links that have, up to time $t$, experienced at least one crossover event, is precisely $G$.
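Theorem 2 can be checked numerically on a toy example. The sketch below assumes three binary sites, two links with hypothetical rates, and measures stored as dictionaries from types to weights; it integrates the IPL equation (11) with a classical Runge-Kutta scheme and compares the result with the closed form (16). All names (`recombinator`, `composite`, `rk4`, `closed_form`) are illustrative helpers, not notation from the text.

```python
from itertools import combinations, product
from math import exp, prod

LINKS = (0.5, 1.5)                        # links between sites 0-1 and 1-2
RATES = {0.5: 0.7, 1.5: 0.2}              # hypothetical crossover rates
TYPES = list(product((0, 1), repeat=3))   # binary alphabet at three sites


def recombinator(omega, alpha):
    """R_alpha(omega) = (pi_< omega) (x) (pi_> omega) / ||omega|| for positive omega."""
    cut = int(alpha + 0.5)
    norm = sum(omega.values())
    left, right = {}, {}
    for x, w in omega.items():
        left[x[:cut]] = left.get(x[:cut], 0.0) + w
        right[x[cut:]] = right.get(x[cut:], 0.0) + w
    return {x: left[x[:cut]] * right[x[cut:]] / norm for x in omega}


def composite(omega, G):
    """R_G(omega): apply the (commuting) elementary recombinators for all links in G."""
    for alpha in sorted(G):
        omega = recombinator(omega, alpha)
    return omega


def rhs(omega):
    """Right-hand side of the IPL equation: sum_alpha rho_alpha (R_alpha - 1)(omega)."""
    out = {x: 0.0 for x in omega}
    for alpha, rho in RATES.items():
        r = recombinator(omega, alpha)
        for x in omega:
            out[x] += rho * (r[x] - omega[x])
    return out


def rk4(omega, t_end, steps):
    """Integrate the ODE from 0 to t_end with the classical 4th-order Runge-Kutta method."""
    h = t_end / steps
    shift = lambda base, k, a: {x: base[x] + a * k[x] for x in base}
    for _ in range(steps):
        k1 = rhs(omega)
        k2 = rhs(shift(omega, k1, h / 2))
        k3 = rhs(shift(omega, k2, h / 2))
        k4 = rhs(shift(omega, k3, h))
        omega = {x: omega[x] + h / 6 * (k1[x] + 2 * k2[x] + 2 * k3[x] + k4[x])
                 for x in omega}
    return omega


def closed_form(omega0, t):
    """omega_t = sum_G a_G(t) R_G(omega_0), with a_G(t) as in Theorem 2."""
    out = {x: 0.0 for x in omega0}
    for r in range(len(LINKS) + 1):
        for G in combinations(LINKS, r):
            a = prod((1 - exp(-RATES[l] * t)) if l in G else exp(-RATES[l] * t)
                     for l in LINKS)
            rg = composite(omega0, G)
            for x in out:
                out[x] += a * rg[x]
    return out


raw = [1, 2, 3, 4, 5, 6, 7, 8]            # arbitrary positive weights
omega0 = {x: w / sum(raw) for x, w in zip(TYPES, raw)}
numeric = rk4(omega0, 3.0, 3000)
exact = closed_form(omega0, 3.0)
max_error = max(abs(numeric[x] - exact[x]) for x in TYPES)
print(max_error)
```

The deviation between the two results is at the level of the integration error, which is consistent with the statement of the theorem for this example.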

Note that the above result relies on the assumption of single, indepen- 
dent crossover events, which is described by recombinators that commute. In 
more general models, with multiple, dependent events, the algebraic struc- 
ture is rather involved. This makes solutions much more cumbersome, or 
considerably less explicit in nature (for review, see [^, Ch. 6]). 

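Let us come back to the meaning of Eq. (16) in combination with Theorem 2. If $\varphi_t$ denotes the flow of the IPL equation (11), we obtain, for all $t > 0$, the identity

\[ \varphi_t = \sum_{G\subset L} a_G(t)\, R_G \tag{17} \]

which is valid on the cone $\mathcal{M}_+(X)$. As usual, $\varphi_0 = 1$ and $\varphi_t \circ \varphi_s = \varphi_{t+s}$, for all $t, s > 0$. This implies the identity

\[ a_G(t+s) = \sum_{\substack{H,K\subset L \\ H\cup K = G}} a_H(t)\, a_K(s) \]

which can be verified by direct computation. More interestingly, we also have

Fact 4 On $\mathcal{M}_+(X)$, the forward flow of (11) commutes with the recombinators, i.e. $R_G \circ \varphi_t = \varphi_t \circ R_G$, for all $t > 0$ and $G \subset L$.

Proof: Let $\nu \in \mathcal{M}_+(X)$ and fix $G \subset L$. Then

\begin{align*}
R_G\bigl(\varphi_t(\nu)\bigr) &= R_G\Bigl(\sum_{H\subset L} a_H(t)\, R_H(\nu)\Bigr) = \sum_{H\subset L} a_H(t)\, R_{G\cup H}(\nu) \\
 &= \sum_{H\subset L} a_H(t)\, R_H\bigl(R_G(\nu)\bigr) = \varphi_t\bigl(R_G(\nu)\bigr)
\end{align*}

by an application of Corollary 2. □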
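The convolution-type identity for the coefficient functions that follows from the flow property can be checked directly. The sketch below does so for three links with hypothetical rates; the helper names are assumptions made for this illustration.

```python
from itertools import combinations
from math import exp, prod

RATES = {0.5: 0.3, 1.5: 1.1, 2.5: 0.6}   # hypothetical crossover rates


def subsets(items):
    """All subsets of `items`, returned as frozensets."""
    items = tuple(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def a(G, t):
    """Coefficient function a_G(t) of Theorem 2."""
    return prod((1 - exp(-rho * t)) if link in G else exp(-rho * t)
                for link, rho in RATES.items())


def convolved(G, t, s):
    """Sum of a_H(t) a_K(s) over all pairs (H, K) with H union K = G."""
    parts = subsets(G)  # H and K are necessarily subsets of G
    return sum(a(H, t) * a(K, s) for H in parts for K in parts if H | K == G)


t, s = 0.8, 1.9
max_gap = max(abs(a(G, t + s) - convolved(G, t, s)) for G in subsets(RATES))
print(max_gap)
```

The identity factorizes over the links: a link in $G$ is hit during $[0,t]$, during $[t,t+s]$, or both, while a link outside $G$ is spared in both intervals.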
Once the solution is known, the remaining task is to identify linear combinations of the $R_H(\omega)$ that decouple from each other and decay exponentially. To this end, we employ combinatorial techniques to regroup the terms of the solution according to their exponential damping factors. Let us first expand the expression for $a_G(t)$,

\[ a_G(t) = \sum_{K\subset G} (-1)^{|G-K|} \exp\Bigl(-\sum_{\alpha\in\bar K} \varrho_\alpha t\Bigr) . \]

This suggests to define new functions $b_K(t)$ via

\[ b_K(t) := \exp\Bigl(-\sum_{\alpha\in\bar K} \varrho_\alpha t\Bigr) \tag{18} \]

with the usual convention that the empty sum is 0. In particular, we have $b_\varnothing(t) = a_\varnothing(t) = \exp\bigl(-\sum_{\alpha\in L} \varrho_\alpha t\bigr)$ and $b_K(0) = 1$ for all $K \subset L$. Now, the

Mobius inversion of (@) and (H), used backwards, gives us the relation 



\[ b_K(t) = \sum_{G\subset K} a_G(t) . \]

One immediate consequence is

\[ \sum_{G\subset L} a_G(t) = b_L(t) = 1 . \tag{19} \]

So, together with the observation that the functions $a_G(t)$ of Theorem 2 are always non-negative, we have independently confirmed

Fact 5 If $\omega_0 \in \mathcal{M}_+(X)$, the coefficient functions $a_G(t)$ of Theorem 2 constitute a convex linear combination of positive measures in Eq. (16). □
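The relation between the functions $a_G(t)$ and $b_K(t)$ is a Möbius inversion over the subset lattice of $L$, and both directions can be verified numerically. In the sketch below, the rates and helper names are assumptions for illustration.

```python
from itertools import combinations
from math import exp, prod

RATES = {0.5: 0.4, 1.5: 1.3, 2.5: 0.05}   # hypothetical crossover rates
LINKS = frozenset(RATES)


def subsets(items):
    """All subsets of `items`, returned as frozensets."""
    items = tuple(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def a(G, t):
    """a_G(t): links outside G are spared, links in G are hit."""
    return prod((1 - exp(-rho * t)) if link in G else exp(-rho * t)
                for link, rho in RATES.items())


def b(K, t):
    """b_K(t) = exp(-t * sum of the rates of all links outside K)."""
    return exp(-t * sum(rho for link, rho in RATES.items() if link not in K))


t = 1.7
# b_K = sum over G subset of K of a_G
sum_gap = max(abs(b(K, t) - sum(a(G, t) for G in subsets(K)))
              for K in subsets(LINKS))
# a_G = sum over K subset of G of (-1)^{|G - K|} b_K
inv_gap = max(abs(a(G, t) - sum((-1) ** len(G - K) * b(K, t) for K in subsets(G)))
              for G in subsets(LINKS))
print(sum_gap, inv_gap)
```

For $K = L$ the first relation reduces to the normalization (19), since the sum over the empty set of spared links in the exponent vanishes.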

The significance of the new functions becomes clear by realizing that there is an analogue on the level of operators. To this end, we rewrite the composite recombinators in terms of new operators via $R_H = \sum_{G\supset H} T_G$ and obtain, by an obvious variant of Möbius inversion,

\[ T_G := \sum_{H\supset G} (-1)^{|H-G|}\, R_H . \tag{20} \]

A straight-forward calculation then reveals that

\[ \omega_t = \sum_{G\subset L} a_G(t)\, R_G(\omega_0) = \sum_{K\subset L} b_K(t)\, T_K(\omega_0) . \tag{21} \]

Note that, as a consequence of Eqs. (10), (12) and (20), the operators $T_G$ are positive homogeneous of degree one, i.e.

\[ T_G(a\omega) = |a| \cdot T_G(\omega) . \tag{22} \]

Let us now introduce new measures $\nu_G(t) := b_G(t)\, T_G(\omega_0)$, which are elements of $\mathcal{M}(X)$, but no longer positive in general.

Proposition 5 The signed measures $\nu_G(t)$ solve the Cauchy problem

\[ \dot\nu_G(t) = -\Bigl(\sum_{\alpha\in\bar G} \varrho_\alpha\Bigr)\, \nu_G(t) \]

with initial condition $\nu_G(0) = T_G(\omega_0)$, for all $G \subset L$.

Proof: The result is a direct consequence of the fact that the coefficient functions $b_G(t)$ solve the ordinary initial value problems

\[ \dot b_G(t) = -\Bigl(\sum_{\alpha\in\bar G} \varrho_\alpha\Bigr)\, b_G(t) \]

with initial conditions $b_G(0) = 1$, see above. □
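So, the transformation (20) resulted in regrouping the terms of the solution to the IPL equation (11) according to their exponential decay factors in time. In particular,

\[ \nu_L(t) \equiv T_L(\omega_0) = R_L(\omega_0) = \frac{1}{\|\omega_0\|^{n}} \bigotimes_{i=0}^{n} (\pi_i.\omega_0) \]

is the unique limit measure of the process starting from $\omega_0$. Due to the action of $R_L$, it is a complete product measure and reflects total independence, and we obtain $\omega_t \to R_L(\omega_0)$ as $t \to \infty$ in the $\|.\|$-topology. This is so because

\[ \|\omega_t - R_L(\omega_0)\| = \Bigl\|\sum_{K\subsetneq L} b_K(t)\, T_K(\omega_0)\Bigr\| \le \sum_{K\subsetneq L} b_K(t)\, \|T_K(\omega_0)\| \]

where all remaining coefficient functions $b_K(t)$, i.e. those with $K \subsetneq L$, decay exponentially (recall that $\varrho_\alpha > 0$ for all $\alpha \in L$).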



3.3 Linkage disequilibria 

Starting from the measures $\nu_G(t)$, we will now identify a minimal, complete set of variables by evaluating certain $k$-point cylinder functions (called $k$-point functions from now on) or correlation functions known as linkage disequilibria in genetics. They are important for data analysis because they allow one to evaluate associations between sites up to a given order from measured type frequencies, and to average over all others by marginalization. This way, a certain amount of stochasticity, which is present in all real (finite) populations, is smoothed out.

Various different definitions of linkage disequilibria are available in the
literature (see |TI|, p. 183-186] for an overview). But only special choices 
decouple (see [|TB], ^7^), and these are the linkage disequilibria we are after. 
In view of the applications, we will now restrict ourselves to the case that $X$ is a finite set, although the results hold, with only minor modifications, also more generally. Eq. (21) and Proposition 5 suggest to employ the signed measures $T_G(\omega_0)$. The corresponding functions $b_G(t)$ will then describe their evolution in time.

Let $\langle j_1, \dots, j_k\rangle$, with $j_1 < \dots < j_k$, symbolically denote a cylinder set in $X = X_N$ which is specified at sites $j_i$, for $1 \le i \le k$. More specifically, these are sets of the product form

\[ \langle j_1, \dots, j_k\rangle = X_{\{0,\dots,j_1-1\}} \times \{x_{j_1}\} \times [\dots] \times \{x_{j_k}\} \times X_{\{j_k+1,\dots,n\}} \]

where $[\dots]$ contains factors $\{x_i\}$ or $X_i$ depending on whether $i$ appears in $(j_1, \dots, j_k)$ or not. If $\nu \in \mathcal{P}(X)$ and arbitrary $\alpha \in L$, we then have

\[ R_\alpha(\nu)(\langle j_1, \dots, j_k\rangle) =
\begin{cases}
\nu(\langle j_1, \dots, j_k\rangle) & \text{if } \alpha < j_1 \text{ or } \alpha > j_k , \\
\nu(\langle j_1, \dots, j_s\rangle)\; \nu(\langle j_{s+1}, \dots, j_k\rangle) & \text{if } j_s < \alpha < j_{s+1} .
\end{cases} \]

For later convenience, we also define $\langle\varnothing\rangle = X$ so that $R_\alpha(\nu)(\langle\varnothing\rangle) = 1$.

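Lemma 3 If $\nu \in \mathcal{P}(X)$, we have $T_G(\nu)(\langle j_1, \dots, j_k\rangle) = 0$ whenever the complement $\bar G = L \setminus G$ contains an element that is less than $j_1$ or larger than $j_k$.

Proof: Let $I = \{\beta \mid j_1 < \beta < j_k\}$. Assume there is an $\alpha \in \bar G \setminus I$. Then

\begin{align*}
T_G(\nu)(\langle j_1, \dots, j_k\rangle)
 &= \sum_{\alpha\notin\dot H\supset G} (-1)^{|H-G|}\, \bigl(R_H(\nu) - R_\alpha(R_H(\nu))\bigr)(\langle j_1, \dots, j_k\rangle) \\
 &= \sum_{\alpha\notin\dot H\supset G} (-1)^{|H-G|}\, \bigl(R_H(\nu) - R_H(\nu)\bigr)(\langle j_1, \dots, j_k\rangle)
\end{align*}

where the previous calculation was used in the second step, and summation is over $\dot H$. Clearly, the last expression vanishes. □

Let us now define the time-dependent $k$-point functions as

\[ F_G^t(j_1, \dots, j_k) = T_G(\omega_t)(\langle j_1, \dots, j_k\rangle) \tag{23} \]

for arbitrary $G \subset L$, where the notation is again symbolic in that we only specify the positions $j_i$, but not the corresponding values. To relate this to Eq. (21), we show

Proposition 6 If $\omega_0 \in \mathcal{M}_+(X)$, we have $T_G(\omega_t) = b_G(t)\, T_G(\omega_0)$, for all $G \subset L$ and $t > 0$.

Proof: Since $b_G(0) = 1$, equality holds for $t = 0$, and the claim follows if we show that $T_G(\omega_t)$ and $b_G(t)\, T_G(\omega_0)$ satisfy the same differential equation. With $\omega_t = \varphi_t(\omega_0)$, compare (17), we obtain

\begin{align*}
\frac{d}{dt}\, T_G(\omega_t) &= \sum_{H\supset G} (-1)^{|H-G|}\, \frac{d}{dt}\, R_H\bigl(\varphi_t(\omega_0)\bigr) \\
 &= \sum_{H\supset G} (-1)^{|H-G|}\, \frac{d}{dt}\, \varphi_t\bigl(R_H(\omega_0)\bigr) && \text{(by Fact 4)} \\
 &= \sum_{H\supset G} (-1)^{|H-G|} \sum_{\alpha\in L} \varrho_\alpha\, (R_\alpha - 1)\bigl(\varphi_t(R_H(\omega_0))\bigr) && \text{(by Eq. (11))} \\
 &= \sum_{H\supset G} (-1)^{|H-G|} \sum_{\alpha\in\bar G} \varrho_\alpha\, (R_{H\cup\{\alpha\}} - R_H)(\omega_t) && \text{(by Fact 4)} \\
 &= -\Bigl(\sum_{\alpha\in\bar G} \varrho_\alpha\Bigr)\, T_G(\omega_t) .
\end{align*}

The last step is correct because $\sum_{H\supset G} (-1)^{|H-G|}\, R_{H\cup\{\alpha\}} = 0$ for $\alpha \in \bar G$, by an argument analogous to the one used in the last step of the proof of Lemma 3. Now, a comparison with Proposition 5 establishes the claim. □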
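The vanishing property of Lemma 3 and the meaning of the surviving functions can be illustrated on three binary sites. The sketch below builds the signed measures $T_G(\nu)$ from the composite recombinators and evaluates them on the cylinder set specified at sites 1 and 2; the probability vector and all helper names are assumptions made for this example.

```python
from itertools import combinations, product

LINKS = (0.5, 1.5)
TYPES = list(product((0, 1), repeat=3))


def recombinator(omega, alpha):
    """R_alpha(omega) for a positive measure stored as {type: weight}."""
    cut = int(alpha + 0.5)
    norm = sum(omega.values())
    left, right = {}, {}
    for x, w in omega.items():
        left[x[:cut]] = left.get(x[:cut], 0.0) + w
        right[x[cut:]] = right.get(x[cut:], 0.0) + w
    return {x: left[x[:cut]] * right[x[cut:]] / norm for x in omega}


def composite(omega, H):
    """R_H(omega) as a product of elementary recombinators."""
    for alpha in sorted(H):
        omega = recombinator(omega, alpha)
    return omega


def T(omega, G):
    """T_G(omega) = sum over H containing G of (-1)^{|H - G|} R_H(omega)."""
    G = frozenset(G)
    rest = [l for l in LINKS if l not in G]
    out = {x: 0.0 for x in omega}
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            rh = composite(omega, G | set(extra))
            for x in out:
                out[x] += (-1) ** r * rh[x]
    return out


def cylinder(measure, spec):
    """Measure of the cylinder set fixing x_i = v for every (i, v) in spec."""
    return sum(w for x, w in measure.items()
               if all(x[i] == v for i, v in spec.items()))


raw = [5, 1, 2, 4, 3, 3, 1, 6]                      # arbitrary correlated weights
nu = {x: w / sum(raw) for x, w in zip(TYPES, raw)}
spec = {1: 1, 2: 1}                                 # cylinder <1,2> with x_1 = x_2 = 1

t_empty = cylinder(T(nu, ()), spec)                 # link 0.5 < j_1 is missing from G
t_outer = cylinder(T(nu, (0.5,)), spec)             # G = all links outside (j_1, j_k)
d_classic = cylinder(nu, spec) - cylinder(nu, {1: 1}) * cylinder(nu, {2: 1})
print(t_empty, t_outer, d_classic)
```

In this example, the function with $G = \{\tfrac12\}$ reproduces the familiar two-site disequilibrium (joint frequency minus product of the marginal frequencies), while the function with $G = \varnothing$ vanishes.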

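Even after Lemma 3 there are still too many functions around. It is thus reasonable to select an independent set from them. To see how to do this, assume that we have an index $\alpha \in G \cap I$, with $I = \{\beta \mid j_1 < \beta < j_k\}$ for a cylinder set of type $\langle j_1, \dots, j_k\rangle$ as above. Let $H$ be a subset of $L$ that contains $G$, so $\alpha \in H$ in particular, and $\nu \in \mathcal{P}(X)$. Then $R_H(\nu) = R_\alpha(\nu_H)$ with $\nu_H = R_{H\setminus\{\alpha\}}(\nu)$. The little calculation before Lemma 3 now tells us that

\begin{align*}
R_H(\nu)(\langle j_1, \dots, j_k\rangle)
 &= [\nu_H(\langle j_1, \dots, j_s\rangle)] \cdot [\nu_H(\langle j_{s+1}, \dots, j_k\rangle)] \\
 &= [R_{H_{<\alpha}}(\nu)(\langle j_1, \dots, j_s\rangle)] \cdot [R_{H_{>\alpha}}(\nu)(\langle j_{s+1}, \dots, j_k\rangle)]
\end{align*}

where $j_s < \alpha < j_{s+1}$, and where $H_{<\alpha}$ (resp. $H_{>\alpha}$) denotes the set of links in $H$ below (resp. above) $\alpha$. Consequently, defining $I_1 = \{\beta \mid j_1 < \beta < j_s\}$ and $I_2 = \{\beta \mid j_{s+1} < \beta < j_k\}$, and referring back to (20), we also get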

hdg 

HDG K^DH 

K2D{HUl2) 

where Lemma 3 was used in the last step to remove terms that vanish. This equation means that $T_G(\nu)(\langle j_1, \dots, j_k\rangle)$, whenever an $\alpha \in G \cap I$ exists, either vanishes (if Lemma 3 applies) or is a polynomial expression in $\ell$-point functions with $\ell < k$.

In the above calculation, $\nu$ is an arbitrary probability measure, wherefore the equations apply to $\omega_t$, for an arbitrary $t > 0$. Whenever $G \cap I \ne \varnothing$, the time-dependent $k$-point functions are polynomially dependent on $\ell$-point functions with $\ell < k$. Consequently, they do not contain new information. So far, we have:

Proposition 7 The $k$-point function $F_G^t(j_1, \dots, j_k) = T_G(\omega_t)(\langle j_1, \dots, j_k\rangle)$ can only be non-vanishing and (polynomially) independent from $\ell$-point functions with $\ell < k$ if $G = \bar I = \{\beta < j_1\} \cup \{\beta > j_k\}$. □

We choose this collection of $k$-point functions as our linkage disequilibria.

Let us finally observe that the summation of a $k$-point function over all possible values of one of the specified $x_i$ (i.e. marginalization) reduces it to a $(k-1)$-point function, so we have one extra (linear) relation. This means that only $M_i - 1$ possible values can be prescribed independently at site $i$. On the other hand, given $\langle j_1, \dots, j_k\rangle$, there is only one way to choose $G$ due to Proposition 7, and then there are $(M_{j_1} - 1) \cdot \ldots \cdot (M_{j_k} - 1)$ different and independent choices to specify the actual values at the sites. Summing up all these possibilities results in

\[ \sum_{D\subset N} \prod_{i\in D} (M_i - 1) = \prod_{i=0}^{n} \bigl(1 + (M_i - 1)\bigr) = \prod_{i=0}^{n} M_i = |X| . \]

This means that we have singled out the right number of functions. In view of Proposition 7, for $t$ arbitrary but fixed, they completely determine the value of the signed measures $T_G(\omega_t)$ on all cylinder sets. These, in turn, are closed under finite intersections and generate the full $\sigma$-algebra of the (finite) space $X$, so all measures $T_G(\omega_t)$, and hence also $\omega_t$, are uniquely specified, and we have achieved our goal. An explicit example has been worked out
in Section 4 of 0, where the fc-point functions . . . , j^) appear as the 

components of the vector z of linkage disequilibria, up to a change of basis 
in the local site spaces. 
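The counting argument can be reproduced by brute force. The following sketch sums the number of independent value assignments over all subsets $D$ of sites and compares the result with $|X|$; the alphabet sizes are hypothetical and the function name is an assumption.

```python
from itertools import combinations
from math import prod


def count_functions(alphabet_sizes):
    """Sum over all site subsets D of prod_{i in D} (M_i - 1)."""
    sites = range(len(alphabet_sizes))
    total = 0
    for k in range(len(alphabet_sizes) + 1):
        for D in combinations(sites, k):
            # the empty subset contributes 1 (empty product)
            total += prod(alphabet_sizes[i] - 1 for i in D)
    return total


sizes = [2, 3, 4]            # hypothetical alphabet sizes M_0, M_1, M_2
count = count_functions(sizes)
print(count, prod(sizes))    # both equal |X| = 24
```

The term for the empty subset corresponds to the normalization of the measure, so the remaining $|X| - 1$ functions match the number of free parameters of a probability vector on $X$.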

If $X$ is not a finite set, one has to use a generating family of Borel cylinder sets instead of just singleton sets, and invoke Fact 1. Although there is no simple counting argument, the general structure is still similar.

At this point, one could still argue that $k$-point functions w.r.t. the selection of sites, as our $F_G^t(j_1, \dots, j_k)$, should be replaced by proper $k$-point correlation functions because these separate off all contributions of functions of lower order, i.e. of $\ell$-point functions with $\ell < k$. This is just another application of the Möbius inversion principle, but one where all partitions (rather than only ordered ones) are needed. We provide the corresponding formulas in the Appendix. If one performs the necessary calculations, one quickly realizes that our previous inclusion-exclusion process w.r.t. ordered partitions of the links has far reaching consequences: most of the potential correction terms simply vanish, as a result of Lemma 3. In particular, we obtain

Theorem 3 Let $S = \{j_1, \dots, j_k\}$ be a set of site indices, in increasing order and without gaps, and let $G = \{\alpha < j_1\} \cup \{\alpha > j_k\}$. Then, the $k$-point function $F_G^t(j_1, \dots, j_k) = T_G(\omega_t)(\langle j_1, \dots, j_k\rangle)$ coincides with the corresponding
k-point correlation function as given in Eq. (^31) of the Appendix. 

These functions, for all possible choices of the set S, form a polynomially 
independent set of linkage disequilibria. 
Proof: We apply Lemma ^ with v = uJ^. Due to the assumption on S versus 
G, the right-hand side of Eq. boils down to the one term we already have, 
because all other terms vanish. Propositions |^ and ^ ensure the polynomial 
independence of these objects, which are our linkage disequilibria. □ 

This result does not extend to all $k$-point functions. If, for a given $k$-point function, a non-vanishing correction term occurs in the corresponding correlation function, this will, in general, not decay with the same exponential rate as the original $k$-point function. So, grouping according to decay rates and according to correlation structures simultaneously is not possible in general. It is a rather remarkable fact that the set of linkage disequilibria is a set of exceptions, and one (as we demonstrated above for the case of discrete state spaces) that completely determines the probability measure.



4 Mutation and recombination 

In this section, we will just combine the results of the previous two sections. 
This is possible because, as we will see, mutation and recombination are 
independent in our approach, i.e. the corresponding operators in the IPL 
equation commute. This is to be expected given the fact that mutation acts 
on the sites while recombination works via the links. However, to be able 
to formulate this in a more general situation than X finite or discrete, we 
now restrict ourselves to the Banach space $\mathcal{M}^\otimes := \bigotimes_{i=0}^{n} \mathcal{M}(X_i)$ which, as explained earlier, is meant as the completion of the algebraic tensor product. In general, it is a (true) Banach subspace of $\mathcal{M}(X)$. Our IPL equation now reads

\[ \dot\omega = \Bigl(\sum_{i\in N} \mu_i Q_i + \sum_{\alpha\in L} \varrho_\alpha\, (R_\alpha - 1)\Bigr)(\omega) \tag{24} \]

where we have taken the liberty to introduce mutation rates $\mu_i$, all of which are assumed to be strictly positive. The idea behind this is to use some standardized version for the mutation operators $Q_i$ introduced earlier so that the $\mu_i$ serve as relative coefficients, in line with the usual practice in the biological literature. The linear operators $Q_i$ are supposed to be bounded, hence continuous,



and thus possess a unique extension to A^®, compare |^5|, Thms. II. 1.2 and 
II. 1.5]. To show consistency, we observe

Lemma 4 The Banach space $\mathcal{M}^\otimes$ is invariant under $R_\alpha$, for all $\alpha \in L$, and hence positive invariant under the flow of (24).

Proof: It is clear that $R_\alpha$ maps a finite linear combination of product measures onto another linear combination of this kind, compare the proof of Prop. 2. Since such linear combinations are dense in $\mathcal{M}^\otimes$ and $R_\alpha$ is Lipschitz on $\mathcal{M}(X)$, it maps the closed subspace $\mathcal{M}^\otimes$ of $\mathcal{M}(X)$ into itself.
The statement on positive invariance is a direct consequence of Thm. 16.5 
and Remark 16.6]. □ 

Let $\mathcal{P}^\otimes$ be the subset of probability measures in $\mathcal{M}^\otimes$. Referring back to Propositions 1 and 3 and to Theorem 1, the following result is immediate.

Proposition 8 The abstract Cauchy problem of the IPL equation (24), with initial condition $\omega_0 \in \mathcal{M}^\otimes$, has a unique solution. The cone $\mathcal{M}_+^\otimes$ is positive invariant, and the norm of a positive measure is preserved in forward time. In particular, the convex set $\mathcal{P}^\otimes$ is positive invariant. □

To continue, let us call a positive linear operator $W$ on $\mathcal{M}^\otimes$ strictly positive if $\omega \in \mathcal{M}^\otimes$ with $\omega > 0$ implies $W\omega > 0$. The key observation is now

Lemma 5 Let $W$ be a strictly positive bounded linear operator on $\mathcal{M}^\otimes$ which has a complete tensor product structure, i.e. $W = w_0 \otimes \dots \otimes w_n$. On $\mathcal{M}_+^\otimes$, the elementary recombinator $R_\alpha$ then commutes with $W$, i.e. $W R_\alpha = R_\alpha W$. In particular, this is true if $W = \exp(tQ_i)$ is an element of a Markov semigroup, as in Section 2, for any $t > 0$, $i \in N$ and $\alpha \in L$.

Proof: Let us first consider the case that $W$ preserves the norm of a positive measure $\nu$, i.e. $\|W\nu\| = \|\nu\|$. This is also true of $R_\alpha$, $\alpha \in L$. Since $W$ is linear and $R_\alpha$ positive homogeneous of degree 1, it is sufficient to prove the claim on $\mathcal{P}^\otimes$. So, let $\nu \in \mathcal{P}^\otimes$. $W$ has a complete tensor product structure, so $W = W_{<\alpha} \otimes W_{>\alpha}$ in particular. Observe first that $W_{<\alpha} \circ \pi_{<\alpha} = \pi_{<\alpha} \circ W$ and $W_{>\alpha} \circ \pi_{>\alpha} = \pi_{>\alpha} \circ W$. These relations certainly hold when applied to a product measure $\nu = \nu_{<\alpha} \otimes \nu_{>\alpha}$ but, due to linearity of all mappings involved here, also on arbitrary (finite) linear combinations of measures of this kind. The latter are dense in $\mathcal{M}^\otimes$ so that continuity of the mappings establishes
the relations, compare |^5|, Thm. II. 1.5]. 
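As a consequence, we obtain

\begin{align*}
W R_\alpha(\nu) &= W\bigl((\pi_{<\alpha}.\nu) \otimes (\pi_{>\alpha}.\nu)\bigr) = \bigl(W_{<\alpha}(\pi_{<\alpha}.\nu)\bigr) \otimes \bigl(W_{>\alpha}(\pi_{>\alpha}.\nu)\bigr) \\
 &= \bigl(\pi_{<\alpha}.(W\nu)\bigr) \otimes \bigl(\pi_{>\alpha}.(W\nu)\bigr) = R_\alpha(W\nu)
\end{align*}

which proves the assertion for the case that $W$ preserves the norm of $\nu$.

Let us now consider the general case. The proof so far only required that $W$ preserved the norm of the single $\nu$ under consideration. We employ again positive homogeneity of $R_\alpha$. If $\nu > 0$, we have $W\nu > 0$ by assumption, so that $a := \|\nu\| / \|W\nu\| > 0$ is well defined. So we obtain $\|\nu\| = \|aW\nu\|$ and

\begin{align*}
W R_\alpha(\nu) &= \tfrac{1}{a}\,(aW)\, R_\alpha(\nu) \\
 &= \tfrac{1}{a}\, R_\alpha(aW\nu) && \text{(by above argument)} \\
 &= R_\alpha(W\nu) && \text{(by Eq. (10))}
\end{align*}

which proves the first assertion.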

The second claim is obvious because elements of a Markov semigroup are 
strictly positive and because the generators Qi, compare Eq. have the 
required product structure. □ 
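The commutation property of Lemma 5 can be observed numerically for a product of small Markov matrices. The sketch below uses three binary sites and hypothetical column-stochastic matrices $w_i$ (entry `w[x][y]` is the weight for moving from state `y` to state `x`); the helper names are assumptions made for this illustration.

```python
from itertools import product

TYPES = list(product((0, 1), repeat=3))


def recombinator(omega, alpha):
    """R_alpha(omega) for a positive measure stored as {type: weight}."""
    cut = int(alpha + 0.5)
    norm = sum(omega.values())
    left, right = {}, {}
    for x, w in omega.items():
        left[x[:cut]] = left.get(x[:cut], 0.0) + w
        right[x[cut:]] = right.get(x[cut:], 0.0) + w
    return {x: left[x[:cut]] * right[x[cut:]] / norm for x in omega}


def apply_product_operator(ws, omega):
    """(W omega)(x) = sum_y prod_i ws[i][x_i][y_i] omega(y), W = w_0 (x) w_1 (x) w_2."""
    out = {}
    for x in omega:
        total = 0.0
        for y, weight in omega.items():
            kernel = 1.0
            for w, xi, yi in zip(ws, x, y):
                kernel *= w[xi][yi]
            total += kernel * weight
        out[x] = total
    return out


# Hypothetical column-stochastic matrices: each column sums to 1, so W is Markov.
WS = [
    [[0.9, 0.2], [0.1, 0.8]],
    [[0.5, 0.3], [0.5, 0.7]],
    [[0.6, 0.6], [0.4, 0.4]],
]

raw = [5, 1, 2, 4, 3, 3, 1, 6]
omega = {x: r / sum(raw) for x, r in zip(TYPES, raw)}

gaps = {}
for alpha in (0.5, 1.5):
    lhs = apply_product_operator(WS, recombinator(omega, alpha))
    rhs = recombinator(apply_product_operator(WS, omega), alpha)
    gaps[alpha] = max(abs(lhs[x] - rhs[x]) for x in TYPES)
print(gaps)
```

The product structure of the operator is what matters here: marginalization to the left or right of a link can be interchanged with the corresponding partial product of the $w_i$.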

We can now put together our previous efforts. The obvious form of the solution of (24) is now

\[ \omega_t = \exp(tQ) \sum_{G\subset L} a_G(t)\, R_G(\omega_0) \tag{25} \]

with $Q = \sum_{i=0}^{n} \mu_i Q_i$ and the coefficient functions $a_G(t)$ of Theorem 2. The verification that this indeed solves the IPL equation is a simple application of the product rule. Let $\nu_t := \sum_{G\subset L} a_G(t)\, R_G(\omega_0)$, so that $\omega_t = \exp(tQ)\, \nu_t$. Then we have

\begin{align*}
\dot\omega_t &= Q\,\omega_t + \exp(tQ)\, \dot\nu_t \\
 &= Q\,\omega_t + \exp(tQ) \sum_{\alpha\in L} \varrho_\alpha\, (R_\alpha - 1)(\nu_t) && \text{(by Theorem 2)} \\
 &= \Bigl(Q + \sum_{\alpha\in L} \varrho_\alpha\, (R_\alpha - 1)\Bigr)\bigl(\exp(tQ)\, \nu_t\bigr) && \text{(by Lemma 5)}
\end{align*}

So, together with Proposition 8, we have established:

Theorem 4 The unique solution of the IPL equation (24), with initial condition $\omega_0 \in \mathcal{M}_+^\otimes$, is given by $\omega_t$ of (25), with the coefficient functions $a_G(t)$ of Theorem 2. □

Let us take a closer look at the asymptotic behaviour. Since $a_G(t)$ decreases, as $t \to \infty$, exponentially to 0 unless $G = L$, we obtain

n 

~ exp(tQ)(ai(t)i?i(a;o)) ~ exp(tg) (tt^.^q) 

i=0 

n 

= (g) (exp(tgJ(7r,.a;o)) 

where all neglected terms are of lower order in that they vanish exponentially (recall that $\exp(tQ)$ and $\exp(t q_i)$ are Markov). This shows that the stationary measure, for any initial measure $\omega_0$, is again a complete product measure$^1$. Whether or not there is a unique global equilibrium measure then depends on the properties of the local mutation operators $q_i$. In the case that $X$ is finite, uniqueness follows if all these generators are irreducible.

What remains to be done is to extend the Möbius trick and to evaluate the linkage disequilibria also for this case. Due to Lemma 5, we can equivalently write $\omega_t$ of (25) as

\[ \omega_t = \sum_{G\subset L} a_G(t)\, R_G\bigl(\exp(tQ)\, \omega_0\bigr) . \tag{26} \]

At any fixed instant of time, $\exp(tQ)\, \omega_0$ is a positive measure, and we can employ Eq. (21) to obtain

\[ \omega_t = \sum_{K\subset L} b_K(t)\, T_K\bigl(\exp(tQ)\, \omega_0\bigr) \tag{27} \]

with the functions $b_K(t)$ introduced in (18).

If we now assume again that $X$ is finite, we can use the $k$-point cylinder functions as before to select a finite set of linkage disequilibria that completely determine the solution $\omega_t$. They are the functions

\[ F_G^t(j_1, \dots, j_k) = T_G(\omega_t)(\langle j_1, \dots, j_k\rangle) \tag{28} \]

for $G \subset L$ and selected cylinder sets $\langle j_1, \dots, j_k\rangle$ exactly as before.

Since mutation and recombination are independent of each other and the time evolutions commute, we can separate the time decay due to the two processes. The effect is as follows. Recombination is sensitive to sites selected in the cylinder sets, but not to the actual values prescribed there. Mutation, in turn, has a tensor product structure with respect to the sites (which expresses the independence of individual events).

$^1$ Convergence to product measures is also known from various interacting particle systems.

If $\exp(tQ)$ is Markov (so that Lemma 5 applies), it is easy to derive (in analogy with the proof of Proposition 6) that

\[ \frac{d}{dt}\, T_G(\omega_t) = \Bigl(Q - \sum_{\alpha\in\bar G} \varrho_\alpha\Bigr)\, T_G(\omega_t) . \tag{29} \]

This shows how the recombination rates and the eigenvalues of $Q$ together determine the fine structure of exponential decay. Note that diagonalizing $Q$ (if at all possible) now corresponds to taking appropriate linear combinations of $T_G(\omega_t)(\langle j_1, \dots, j_k\rangle)$ for fixed $G$ and $j_1, \dots, j_k$, but different values
prescribed at the sites. For finite X, this has been worked out in 0, along 
with explicit examples. 



5 Selection 

Let us first look at selection in a slightly more general way, i.e. via an IPL 
equation on $\mathcal{M}(X)$ without explicit reference to its tensor product structure. Let $P : \mathcal{M}(X) \to \mathcal{M}(X)$ be a bounded linear operator which generates a
positive semigroup. According to Thm. 1.11], the latter is true if and 
only if P satisfies our assumption (A2), the positive minimum principle. 
Consider now the ODE 

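\[ \dot\omega = \Phi_{\rm sel}(\omega) := P\omega - \frac{P\omega(X)}{\|\omega\|}\, \omega \tag{30} \]

where $\Phi_{\rm sel}(0) = 0$ is the proper extension of $\Phi_{\rm sel}(\omega)$ to $\omega = 0$. This is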
motivated by the standard selection model (cf. [^), where, in properly co- 
ordinatized form as indicated in Section |l], P is a diagonal matrix which keeps 
track of the 'fitness' of the various states, and $P\omega(X)/\|\omega\|$ is the 'mean fitness' of the population. This model also arises in the infinite population limit of the



well-known Moran model, see |^ or |^2|, Ch. 3]. Here, in a population of m 



individuals with finite state space $X$ as described in Section 3.2, every individual of type $x$ reproduces at rate $r_x$, and the offspring replaces a randomly chosen individual in the population (possibly its own parent). Therefore, a transition from population state $z$ to $z + u_x - u_y$ occurs at rate $r_x z_x z_y / m$. Along the lines of Section 3.2, the limit $m \to \infty$ yields a special case of the differential equation (30), where $P$ is the diagonal matrix with elements $r_x$.

The more general form used here does not only cover more general X, 
but also interaction between mutation and reproduction (as opposed to the 
independent processes considered so far), e.g. the production of mutated 
offspring on the occasion of reproduction. In any case, the subtraction of 
the second term on the right hand side of (30) comes from the preservation of total mass, or, in more technical terms, is designed so that $\Phi_{\rm sel}$ satisfies
assumptions (A2) and (A3) from Section ^. 

So far, our selection equation seems to imply that selection acts on hap-
loids (i.e. individuals with only one copy of the genetic information per cell).
If, however, individuals have two copies that are equivalent and do not in-
teract (the diploid case without dominance), Eq. (30) is replaced by

\[ \dot{\omega} = \frac{M(\omega \otimes P\omega + P\omega \otimes \omega)}{\|\omega\|} - \frac{\bigl(M(\omega \otimes P\omega + P\omega \otimes \omega)\bigr)(X)}{\|\omega\|^{2}}\,\omega \tag{31} \]

where $M(\mu \otimes \nu) := \nu(X) \cdot \mu$ denotes marginalization with respect to the
second factor. In this formulation, the mean fitness is

\[ \frac{\bigl(M(\omega \otimes P\omega + P\omega \otimes \omega)\bigr)(X)}{\|\omega\|^{2}} = 2\,\frac{P\omega(X)}{\|\omega\|}. \]

For positive $\omega$, the right-hand side of (31) becomes

\[ \frac{P\omega(X)}{\|\omega\|}\,\omega + P\omega - 2\,\frac{P\omega(X)}{\|\omega\|}\,\omega = P\omega - \frac{P\omega(X)}{\|\omega\|}\,\omega, \]

that is, the diploid equation reduces to the haploid one in this case, in the
sense that the flow is the same on $\mathcal{M}_+(X)$.
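The reduction of the diploid equation to the haploid one on positive measures can be illustrated on a finite state space. This is a sketch under the assumption that measures are lists of weights and $P$ is a small matrix with non-negative off-diagonal entries; the matrix, the weights and all function names are chosen here for illustration only.

```python
def mat_vec(P, w):
    return [sum(P[i][j] * w[j] for j in range(len(w))) for i in range(len(P))]


def norm(w):
    return sum(abs(x) for x in w)               # total variation norm


def haploid_rhs(P, w):
    """P w - (P w(X) / ||w||) w, the right-hand side of (30)."""
    Pw = mat_vec(P, w)
    mean_fitness = sum(Pw) / norm(w)
    return [a - mean_fitness * b for a, b in zip(Pw, w)]


def marginalize(mu, nu):
    """M(mu (x) nu) = nu(X) * mu, marginalization over the second factor."""
    total = sum(nu)
    return [total * x for x in mu]


def diploid_rhs(P, w):
    """Right-hand side of (31)."""
    Pw = mat_vec(P, w)
    pair = [a + b for a, b in zip(marginalize(w, Pw), marginalize(Pw, w))]
    n = norm(w)
    return [x / n - (sum(pair) / n ** 2) * y for x, y in zip(pair, w)]


P = [[1.0, 0.2], [0.1, 2.0]]    # illustrative operator (assumption)
w = [0.6, 1.4]                  # a positive, non-normalized measure
gap = max(abs(a - b) for a, b in zip(haploid_rhs(P, w), diploid_rhs(P, w)))
```

For a positive measure the two right-hand sides coincide up to rounding, so `gap` is numerically zero.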

Let us now take a closer look at the differential equation (30).

Fact 6 The mapping $\Phi_{\mathrm{sel}} \colon \mathcal{M}(X) \to \mathcal{M}(X)$ is (globally) Lipschitz.

Proof: Consider $\omega, \omega' \in \mathcal{M}(X)$. If one of them is the zero measure, $\omega'$ say,
we get

\[ \|\Phi_{\mathrm{sel}}(\omega) - \Phi_{\mathrm{sel}}(0)\| = \|\Phi_{\mathrm{sel}}(\omega)\| \le \|P\omega\| + \frac{|P\omega(X)|}{\|\omega\|}\,\|\omega\| \le 2\,\|P\omega\| \le 2\,\|P\|\,\|\omega\| \]

where $\|P\| := \sup_{\|\omega\| \le 1} \|P\omega\| < \infty$ because $P$ is a bounded operator by
assumption, and clearly $|P\omega(X)| \le |P\omega|(X) = \|P\omega\|$.
Let now $\omega, \omega'$ both be non-zero. Then



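\[ \|\Phi_{\mathrm{sel}}(\omega) - \Phi_{\mathrm{sel}}(\omega')\| \le \|P\|\,\|\omega - \omega'\| + \Bigl\| \frac{P\omega'(X)}{\|\omega'\|}\,\omega' - \frac{P\omega(X)}{\|\omega\|}\,\omega \Bigr\|. \]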



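Observe that $P\omega(X)/\|\omega\| = P(\omega/\|\omega\|)(X)$. The second term on the right hand side of
the above equation is then clearly majorized by $\|P\|\,\|\omega' - \omega\| + c\,\|\omega\|$ where

\[ c = \Bigl| P\Bigl(\frac{\omega'}{\|\omega'\|} - \frac{\omega}{\|\omega\|}\Bigr)(X) \Bigr| \le \|P\|\,\Bigl\| \frac{\omega'}{\|\omega'\|} - \frac{\omega}{\|\omega\|} \Bigr\| = \frac{\|P\|}{\|\omega\|\,\|\omega'\|}\,\bigl\| \|\omega\|\,\omega' - \|\omega'\|\,\omega \bigr\|. \]

Next, observe that

\[ \bigl\| \|\omega\|\,\omega' - \|\omega'\|\,\omega \bigr\| \le \|\omega'\|\,\bigl| \|\omega\| - \|\omega'\| \bigr| + \|\omega'\|\,\|\omega' - \omega\| \le 2\,\|\omega'\|\,\|\omega - \omega'\| \]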



so that we finally get 

\[ \|\Phi_{\mathrm{sel}}(\omega) - \Phi_{\mathrm{sel}}(\omega')\| \le 4\,\|P\|\,\|\omega - \omega'\|. \]

Together with the previous calculation, we see that $\Phi_{\mathrm{sel}}$ is globally Lipschitz,
with Lipschitz constant $\le 4\,\|P\|$. □
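The Lipschitz estimate can be probed numerically on a finite state space, where the total variation norm is the $\ell^1$ norm and the induced operator norm is the maximal absolute column sum. The sketch below samples random signed measures and records the largest observed ratio; the matrix `P`, the sample size and the function names are assumptions made for illustration, and a random search of this kind is a sanity check, not a proof.

```python
import random


def mat_vec(P, w):
    return [sum(P[i][j] * w[j] for j in range(len(w))) for i in range(len(P))]


def norm(w):
    return sum(abs(x) for x in w)


def phi_sel(P, w):
    """P w - (P w(X) / ||w||) w, with the extension Phi_sel(0) = 0."""
    n = norm(w)
    if n == 0.0:
        return [0.0] * len(w)
    Pw = mat_vec(P, w)
    factor = sum(Pw) / n
    return [a - factor * b for a, b in zip(Pw, w)]


def op_norm(P):
    """Operator norm induced by the l1 norm: maximal absolute column sum."""
    n = len(P)
    return max(sum(abs(P[i][j]) for i in range(n)) for j in range(n))


def worst_ratio(P, samples=2000, seed=0):
    """Largest observed ||Phi(w1) - Phi(w2)|| / ||w1 - w2|| over random pairs."""
    rng = random.Random(seed)
    n = len(P)
    worst = 0.0
    for _ in range(samples):
        w1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        w2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        dist = norm([a - b for a, b in zip(w1, w2)])
        if dist == 0.0:
            continue
        diff = norm([a - b for a, b in zip(phi_sel(P, w1), phi_sel(P, w2))])
        worst = max(worst, diff / dist)
    return worst


P = [[1.0, 0.2, 0.0], [0.1, 2.0, 0.3], [0.0, 0.4, 0.5]]   # illustrative
observed = worst_ratio(P)
bound = 4.0 * op_norm(P)
```

The observed ratio stays below `bound`, in line with the estimate derived in the proof.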

So, we know that the IPL equation (30) defines a unique flow. As before,
we have to check what happens with $\mathcal{M}_+(X)$ under the semiflow in forward
time. Since $\omega_0 = 0$ trivially implies $\omega_t = 0$ for all $t \ge 0$, we exclude this
case from now on. Note that $\omega_0 \ne 0$ results in $\|\omega_t\| > 0$ for all $t \ge 0$, due to
uniqueness. Let $\omega \in \mathcal{M}_+(X)$ be a positive measure and $E$ a Borel set such
that $\omega(E) = 0$. This implies $\bigl(\Phi_{\mathrm{sel}}(\omega)\bigr)(E) = (P\omega)(E) \ge 0$ because $P$ itself
satisfies the positive minimum principle (A2) by assumption. Also, for any
$\omega \in \mathcal{M}_+(X)$, we have

\[ \bigl(\Phi_{\mathrm{sel}}(\omega)\bigr)(X) = P\omega(X) - \frac{P\omega(X)}{\|\omega\|}\,\omega(X) = 0 \]

because $\omega(X) = \|\omega\|$ for positive measures. Together with Fact 6, we see
that assumptions (A1) - (A3) are satisfied, and we can invoke Theorem 1.

Proposition 9 Assume that the linear operator $P$ is bounded and satisfies
(A2). Then the abstract Cauchy problem of the IPL equation (30) with initial
condition $\omega_0$ has a unique solution. The cone of positive measures is positive
invariant under the flow, and the norm of positive measures is preserved. In
particular, $\mathcal{P}(X)$ is positive invariant. □

Remark: We would like to mention that the assumption of bounded $P$ is
somewhat restrictive. For non-compact $X$, many interesting selection models
lead to unbounded $P$. For mutation and selection alone, the more general
situation has been investigated in [EO] and, more recently, in [O, in the
framework of analytic semigroups, compare [19, Ch. II.4.a]. Our emphasis
here is on the basic structure that emerges from the interaction with recom-
bination; this will also carry over to more general cases.
Before we proceed, let us make the following observation.

Fact 7 If the linear operator $P$ is bounded and satisfies the positive minimum
principle, the same is true of $P' = P + c\,\mathbb{1}$ for arbitrary $c \in \mathbb{R}$. Furthermore,
the flow of the IPL equation (30) on $\mathcal{M}_+(X)$ remains unchanged if $P$ is
replaced by $P'$.

Proof: If $\nu$ is a positive measure and $E$ a Borel set with $\nu(E) = 0$, then
$P'\nu(E) = P\nu(E) + c\,\nu(E) = P\nu(E) \ge 0$ because $P$ satisfies (A2) by assump-
tion. Since $P'$ is still bounded, the IPL equation (30) with $P'$ in place of $P$
conforms to Proposition 9. If $\omega \in \mathcal{M}_+(X)$, we obtain

\[ P'\omega - \frac{P'\omega(X)}{\|\omega\|}\,\omega = P\omega + c\,\omega - \frac{P\omega(X)}{\|\omega\|}\,\omega - \frac{c\,\omega(X)}{\|\omega\|}\,\omega = P\omega - \frac{P\omega(X)}{\|\omega\|}\,\omega \]

from which the claim follows. □
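The invariance of the flow under the shift $P \mapsto P + c\,\mathbb{1}$ is easy to confirm on a finite state space. The following sketch uses an illustrative $2 \times 2$ matrix and two arbitrary shifts; all concrete numbers and the helper names `rhs` and `shifted` are assumptions for demonstration.

```python
def mat_vec(P, w):
    return [sum(P[i][j] * w[j] for j in range(len(w))) for i in range(len(P))]


def rhs(P, w):
    """Right-hand side of (30) for a positive measure w on a finite set."""
    Pw = mat_vec(P, w)
    mean_fitness = sum(Pw) / sum(w)             # ||w|| = w(X) for positive w
    return [a - mean_fitness * b for a, b in zip(Pw, w)]


def shifted(P, c):
    """Return P + c * identity."""
    n = len(P)
    return [[P[i][j] + (c if i == j else 0.0) for j in range(n)]
            for i in range(n)]


P = [[1.0, 0.2], [0.1, 2.0]]     # illustrative operator (assumption)
w = [0.3, 0.7]                   # a probability vector
base = rhs(P, w)
gaps = []
for c in (3.7, -2.0):            # arbitrary real shifts
    moved = rhs(shifted(P, c), w)
    gaps.append(max(abs(a - b) for a, b in zip(base, moved)))
```

Each entry of `gaps` vanishes up to rounding, so fitness values matter only up to a common additive constant.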

Once again, although the ODE (30) is nonlinear, it can be solved in closed
terms. This time, we employ Thompson's trick |Q through the substitution

\[ \eta_t = \psi(t)\,\omega_t \tag{32} \]

where $\omega_t$ is a solution of (30). One then obtains

\[ \dot{\eta}_t = P\eta_t + \Bigl( \dot{\psi}(t) - \psi(t)\,\frac{P\omega_t(X)}{\|\omega_t\|} \Bigr)\,\omega_t \]

and a significant simplification is reached if the term in brackets vanishes
because the remaining ODE is then linear. This is achieved by the choice

\[ \psi(t) = \exp\Bigl( \int_0^t \frac{P\omega_\tau(X)}{\|\omega_\tau\|}\,\mathrm{d}\tau \Bigr) = \exp\Bigl( \frac{1}{\|\omega_0\|} \int_0^t P\omega_\tau(X)\,\mathrm{d}\tau \Bigr) \tag{33} \]

where the second step follows from Proposition 9. Clearly, $\psi$ is well defined
(whenever $\omega_0 \ne 0$, which is all we need), and we have reduced the Cauchy
problem of (30) to that of the simple linear evolution equation

\[ \dot{\eta} = P\eta. \tag{34} \]

This ODE defines a uniformly continuous positive semigroup (since $P$ was
assumed to be bounded and to satisfy (A2), the positive minimum principle).
The solution of (34) will no longer have fixed norm, but one can always get
back to $\omega_t$ via

\[ \omega_t = \frac{\|\omega_0\|}{\|\eta_t\|}\,\eta_t. \]

Note that $\eta_0 = \omega_0$ and $\|\omega_t\| \equiv \|\omega_0\|$.
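The linearization can be compared with a direct numerical integration of (30). The sketch below assumes a finite state space and a diagonal $P = \mathrm{diag}(r)$, so that $\exp(tP)$ acts componentwise; the rates, the initial condition, the step count and the function names are illustrative assumptions, and a classical Runge-Kutta scheme stands in for the exact flow.

```python
import math


def phi_sel(r, w):
    """Right-hand side of (30) for P = diag(r) and positive w."""
    mean_fitness = sum(ri * wi for ri, wi in zip(r, w)) / sum(w)
    return [ri * wi - mean_fitness * wi for ri, wi in zip(r, w)]


def rk4(r, w, t_end, steps):
    """Integrate the nonlinear equation with the classical RK4 scheme."""
    h = t_end / steps
    for _ in range(steps):
        k1 = phi_sel(r, w)
        k2 = phi_sel(r, [wi + 0.5 * h * ki for wi, ki in zip(w, k1)])
        k3 = phi_sel(r, [wi + 0.5 * h * ki for wi, ki in zip(w, k2)])
        k4 = phi_sel(r, [wi + h * ki for wi, ki in zip(w, k3)])
        w = [wi + h / 6.0 * (a + 2 * b + 2 * c + d)
             for wi, a, b, c, d in zip(w, k1, k2, k3, k4)]
    return w


def closed_form(r, w0, t):
    """omega_t = ||omega_0|| eta_t / ||eta_t|| with eta_t = exp(tP) omega_0."""
    eta = [math.exp(ri * t) * wi for ri, wi in zip(r, w0)]   # solves (34)
    scale = sum(w0) / sum(eta)
    return [scale * e for e in eta]


r = [1.0, 2.0, 0.5]              # illustrative fitness values (assumption)
w0 = [0.2, 0.3, 0.5]
numeric = rk4(r, w0, 2.0, 2000)
exact = closed_form(r, w0, 2.0)
error = max(abs(a - b) for a, b in zip(numeric, exact))
```

The normalized solution of the linear equation matches the integrated nonlinear flow to within the integration error, and its total mass stays equal to that of `w0`.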

Let us next consider the function

\[ L(t) = \frac{P\omega_t(X)}{\|\omega_t\|} \tag{35} \]

which is defined on any orbit of the flow of (30). $L(t)$ is of particular interest
on orbits of positive measures, where it admits the interpretation as mean
(or averaged) fitness. Here, we know $\|\omega_t\| \equiv \|\omega_0\|$ by Proposition 9, so that
we obtain

\[ \dot{L}(t) = \frac{P\dot{\omega}_t(X)}{\|\omega_0\|} = \frac{P^2\omega_t(X)}{\|\omega_0\|} - \Bigl( \frac{P\omega_t(X)}{\|\omega_0\|} \Bigr)^{2} \]

which has the form of a variance. So we can state

Proposition 10 If, under the assumptions of Proposition 9, $P$ satisfies the
condition $(P\omega(X))^2 \le P^2\omega(X)$ on $\mathcal{P}(X)$, the function $L(t)$ of (35) is a
Lyapunov function for the flow of (30) on the positive cone $\mathcal{M}_+(X)$.

Proof: From the above calculation, it is clear that $\dot{L}(t) \ge 0$ on all orbits
in $\mathcal{M}_+(X)$ if $P$ satisfies the inequality $(P\omega(X))^2 \le P^2\omega(X)$ on $\mathcal{P}(X)$, so $L$
cannot decrease along such an orbit. □
Remark: Our definition of a Lyapunov function on $\mathcal{M}_+(X)$ is global and
(up to a sign) that of [|, Ch. 18]. Note that the stricter version of [^, where
$\dot{L}(t) = 0$ would correspond to a unique equilibrium on $\mathcal{M}_+(X)$, is not so
useful here because the asymptotic state (as $t \to \infty$) of the selection equation
depends on the initial condition, i.e. there is no unique equilibrium in general.
However, one might profit from the use of local Lyapunov functions, compare
[26, Thm. 1.0.2 (iii)], but we do not expand on this here.
The condition on $P$ can be reformulated by noting that

\[ P^2\omega(X) - (P\omega(X))^2 = (P - c\,\mathbb{1})^2\omega(X) \]

with $c = P\omega(X)$. A sufficient condition for Proposition 10 to hold is then
that $(P - c\,\mathbb{1})^2$ is a positive operator for all (or sufficiently many) $c \in \mathbb{R}$. A
particularly well studied case of this is when $X$ is finite and $P$ is a diagonal
matrix in the canonical basis consisting of the extremal measures of $\mathcal{P}(X)$. In
this case, Proposition 10 is known as Fisher's fundamental theorem, see [27]
for details. In the more general case, Lyapunov functions may be considered
even more important since they determine the 'direction' of the evolution
process in a situation where little information is available otherwise, since
the solution given by Eq. (36) is not very explicit then.
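For finite $X$ and diagonal $P$, the statement that the mean fitness grows at a rate equal to the fitness variance can be checked along an explicit orbit. The sketch below assumes $P = \mathrm{diag}(r)$ with illustrative values and uses the normalized solution of the linear equation as the orbit; the function names and the finite-difference step are assumptions for demonstration.

```python
import math


def orbit(r, w0, t):
    """Normalized solution exp(tP) w0 * ||w0|| / ||exp(tP) w0|| for P = diag(r)."""
    eta = [math.exp(ri * t) * wi for ri, wi in zip(r, w0)]
    scale = sum(w0) / sum(eta)
    return [scale * e for e in eta]


def mean_fitness(r, w):
    """L = P w(X) / ||w|| for positive w."""
    return sum(ri * wi for ri, wi in zip(r, w)) / sum(w)


def fitness_variance(r, w):
    """P^2 w(X) / ||w|| - (P w(X) / ||w||)^2."""
    n = sum(w)
    second = sum(ri * ri * wi for ri, wi in zip(r, w)) / n
    return second - mean_fitness(r, w) ** 2


r = [1.0, 2.0, 0.5]              # illustrative fitness values (assumption)
w0 = [0.2, 0.3, 0.5]
times = [0.25 * k for k in range(21)]
L_values = [mean_fitness(r, orbit(r, w0, t)) for t in times]

# Central finite difference of L at t = 1 versus the variance at t = 1.
h = 1e-4
dL = (mean_fitness(r, orbit(r, w0, 1.0 + h))
      - mean_fitness(r, orbit(r, w0, 1.0 - h))) / (2.0 * h)
var = fitness_variance(r, orbit(r, w0, 1.0))
```

The sequence `L_values` is non-decreasing and approaches the largest fitness value, while `dL` and `var` agree up to the finite-difference error.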

The results of this section can also be formulated for the (sub-)space $\mathcal{M}^{\otimes}$
of $\mathcal{M}(X)$, if it is invariant under the action of $P$. In view of the product
structure of $X$, let us now assume that we have $P = \sum_{i=0}^{n} P_i$ with bounded
$P_i$ that are locally represented by $p_i$ (as with $q_i$ versus $Q_i$ before). Clearly, $P$
maps $\mathcal{M}^{\otimes}$ into itself. We call this situation additivity across sites, in complete
analogy to our previous discussion of mutation. We can then rewrite our
solution as

\[ \eta_t = \exp(tP)\,\eta_0 = \Bigl( \bigotimes_{i=0}^{n} \exp(t p_i) \Bigr)\,\eta_0. \tag{36} \]
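The factorization of $\exp(tP)$ for an operator that is additive across sites can be illustrated for two sites with two states each. The sketch below builds $P = p_0 \otimes \mathbb{1} + \mathbb{1} \otimes p_1$ with Kronecker products and compares both sides of the identity; the local matrices, the time $t$ and the truncated Taylor series used for the matrix exponential are assumptions chosen to keep the example dependency-free.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(c, A):
    return [[c * a for a in row] for row in A]


def kron(A, B):
    """Kronecker product; row index (i, k), column index (j, l)."""
    return [[A[i][j] * B[k][l]
             for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]


def expm(A, terms=40):
    """Matrix exponential by a truncated Taylor series (small norms only)."""
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = scale(1.0 / k, mat_mul(term, A))
        result = mat_add(result, term)
    return result


p0 = [[0.5, 0.3], [0.2, 1.0]]    # illustrative local operators (assumption)
p1 = [[1.5, 0.1], [0.4, 0.2]]
t = 0.7
P = mat_add(kron(p0, identity(2)), kron(identity(2), p1))
lhs = expm(scale(t, P))
rhs = kron(expm(scale(t, p0)), expm(scale(t, p1)))
gap = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))
```

The two matrices agree up to rounding because the summands $p_0 \otimes \mathbb{1}$ and $\mathbb{1} \otimes p_1$ commute.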

With some further restrictions on the linear operator $P$, an analogue of
Proposition 10 remains true even in the presence of recombination. This
rests on the applicability of Lemma ^. We thus consider the IPL equation

\[ \dot{\omega} = P\omega - \frac{P\omega(X)}{\|\omega\|}\,\omega + \sum_{\alpha \in L} \varrho_\alpha\,(R_\alpha - \mathbb{1})(\omega) = \Phi_{\mathrm{sel}}(\omega) + \Phi_{\mathrm{rec}}(\omega) \tag{37} \]

whose Cauchy problem has all the nice properties we need, see Proposition
11 below in the special case $Q = 0$. We now assume:

1. $P$ has complete product structure as a generator, i.e. $P = \sum_{i=0}^{n} P_i$ with
$P_i = \mathbb{1} \otimes \cdots \otimes \mathbb{1} \otimes p_i \otimes \mathbb{1} \otimes \cdots \otimes \mathbb{1}$.
2. Each $P_i$ is itself a bounded, strictly positive operator.

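If $\omega_t$ is a solution of (37), we again define $L(t)$ as in (35) and obtain, by
Lemma ^,

\[ \dot{L}(t) = \frac{1}{\|\omega_0\|} \Bigl( P^2\omega_t(X) - \frac{(P\omega_t(X))^2}{\|\omega_0\|} + \sum_{i=0}^{n} \bigl(\Phi_{\mathrm{rec}}(P_i\omega_t)\bigr)(X) \Bigr). \]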



The last term vanishes due to our general assumptions because $P_i\omega_t > 0$
and then $\bigl(\Phi_{\mathrm{rec}}(P_i\omega_t)\bigr)(X) = 0$ due to (A3). So, we are back to the condition
already encountered above. To summarize:

Theorem 5 Let $P = \sum_{i=0}^{n} P_i$ satisfy the assumptions of Proposition 9, and
let each $P_i$ be a bounded, strictly positive operator with complete product
structure. If $P$ also satisfies the condition $(P\omega(X))^2 \le P^2\omega(X)$ on $\mathcal{P}^{\otimes}$, the
function $L(t)$ of (35) is a Lyapunov function for the flow of (37) on the
positive cone $\mathcal{M}^{\otimes}_+$. □

In the absence of recombination, there are other Lyapunov functions 
known for certain combinations of selection with mutation. They rely on 
the spectral theorem applied to $P + Q$, see In selection-recombination
equations where $P$ violates the product structure, the mean fitness $L(t)$ need
no longer be a Lyapunov function. Moreover, the possibility of periodic so- 
lutions demonstrates that, in more general (diploid) models (e.g. with 
dominance), no meaningful Lyapunov function is to be expected. 



6 All three 

In this last step, we combine all three processes, with the general assumptions
as before. In view of the inherent product structure, we only consider the
dynamics on the Banach space $\mathcal{M}^{\otimes}$. The IPL equation now reads

\[ \begin{aligned} \dot{\omega} &= \Phi_{\mathrm{mut}}(\omega) + \Phi_{\mathrm{rec}}(\omega) + \Phi_{\mathrm{sel}}(\omega) \\ &= (Q + P)\,\omega - \frac{P\omega(X)}{\|\omega\|}\,\omega + \sum_{\alpha \in L} \varrho_\alpha\,(R_\alpha - \mathbb{1})(\omega) \end{aligned} \tag{38} \]

and we immediately get the following result, again from Theorem 1 and
Lemma ^

Proposition 11 Let $Q$ be a bounded Markov generator and $P$ a bounded
generator of a positive semigroup, both of product form. Let $R_\alpha$ be the recom-
binators of Eq. (^). Then, the abstract Cauchy problem of the IPL equation
(38) has a unique solution. The cone $\mathcal{M}^{\otimes}_+$ is positive invariant and the norm
of positive measures is preserved under the forward flow. □

Remark: Since $Q$ is a Markov generator, we know from Section that
$Q\omega(X) = 0$ for all $\omega \in \mathcal{M}^{\otimes}$, and we could also start from an IPL equation
where $Q$ is absorbed into $P$; it would give the same flow on $\mathcal{M}^{\otimes}$. We
retain the separation into mutation and selection because, in more general
situations, it is often adequate from both the biological and the mathematical
point of view (for example, the mutation operator is usually bounded, but
the selection operator may be unbounded); for review, see [11, Ch. IV]. We
will also combine $Q$ and $P$, but only after Thompson's linearization trans-
formation.
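The remark that $Q$ may be absorbed into $P$ can be verified directly on a finite state space, where a Markov generator acting on measures (written as column vectors) has vanishing column sums. The sketch below compares the mutation and selection terms of (38) with the pure selection form for $P + Q$; the recombination term is the same in both versions and is therefore left out. The matrices and helper names are illustrative assumptions.

```python
def mat_vec(A, w):
    return [sum(A[i][j] * w[j] for j in range(len(w))) for i in range(len(A))]


def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def mut_sel_rhs(Q, P, w):
    """(Q + P) w - (P w(X) / ||w||) w for a positive measure w."""
    Qw, Pw = mat_vec(Q, w), mat_vec(P, w)
    mean_fitness = sum(Pw) / sum(w)
    return [q + p - mean_fitness * x for q, p, x in zip(Qw, Pw, w)]


def sel_rhs(S, w):
    """S w - (S w(X) / ||w||) w, i.e. selection form with S in place of P."""
    Sw = mat_vec(S, w)
    factor = sum(Sw) / sum(w)
    return [s - factor * x for s, x in zip(Sw, w)]


Q = [[-0.3, 0.1], [0.3, -0.1]]   # column sums vanish: Q w(X) = 0 (assumption)
P = [[1.0, 0.0], [0.0, 2.0]]     # illustrative diagonal fitness operator
w = [0.4, 0.6]
separate = mut_sel_rhs(Q, P, w)
absorbed = sel_rhs(mat_add(Q, P), w)
gap = max(abs(a - b) for a, b in zip(separate, absorbed))
mass_change = sum(mat_vec(Q, w))
```

Since `mass_change` vanishes, the normalizing term is unaffected by $Q$ and both right-hand sides coincide.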

Let $\omega_t$, $t \ge 0$, be the solution for initial condition $\omega_0$. Define $\eta_t$ as above
in (32), with $\psi(t)$ of (33). Then, $\omega_t$ is a solution of (38) if and only if $\eta_t$ solves
the reduced IPL equation

\[ \dot{\eta} = S\eta + \Phi_{\mathrm{rec}}(\eta) \tag{39} \]

where $S = Q + P$ is the bounded generator of a uniformly continuous semi-
group of positive operators. Note that the right hand side of (39) still satis-
fies assumptions (A1) and (A2), but no longer (A3). So, the corresponding
Cauchy problem still has a unique solution, with $\mathcal{M}^{\otimes}_+$ being positive invari-
ant, but the norm of positive measures need no longer be preserved under
the flow in forward time, and this is precisely the point of this exercise!

From now on, we generally assume that both mutation and selection are
adapted to the special product form of our state space $X$, so $S = \sum_{i=0}^{n} S_i$
(with corresponding local operator $s_i$). Hence, $\exp(tS)$ is again a tensor
product of local operators.

Lemma 6 If $S = \sum_{i=0}^{n} S_i$ is the bounded generator of a uniformly continu-
ous semigroup of positive operators, then we have $\exp(tS)\,R_\alpha = R_\alpha \exp(tS)$
on $\mathcal{M}^{\otimes}_+$, for all $t \ge 0$ and $\alpha \in L$.

Proof: Fix $t \ge 0$ and set $W = \exp(tS)$. This is a positive operator by
assumption. Also, since $S$ is bounded, $\nu > 0$ implies $\exp(tS)\nu > 0$ and $W$ is
strictly positive. The result then follows from Lemma ^. □

This result means that we can use all our above methods again and con-
struct immediately the solution of (39). At this point, we particularly profit
from our approach in that we can still solve the case with (additive) selec-
tion. In the context of Haldane linearization, any form of selection has, so
far, appeared as a major obstacle, due to the fact that the flow induced by
$P$ fails to preserve the norm of positive measures |^



Theorem 6 If $S = \sum_{i=0}^{n} S_i$ satisfies the assumptions of Lemma 6, the so-
lution of the reduced IPL equation (39), with initial condition $\eta_0 \in \mathcal{M}^{\otimes}_+$, is
given by

\[ \eta_t = \exp(tS) \sum_{G \subset L} a_G(t)\,R_G(\eta_0) \]

with the coefficients $a_G(t)$ of (13). The solution of the abstract Cauchy prob-
lem for the original IPL equation (38) emerges from here via

\[ \omega_t = \frac{\|\omega_0\|}{\|\eta_t\|}\,\eta_t \]

where $\omega_0 = \eta_0$. If $\omega_0 \in \mathcal{P}(X)$, then $\{\omega_t \mid t \ge 0\}$ is a one-parameter family of
probability measures. □

In line with our previous reasoning, we can determine the asymptotic
behaviour,

\[ \eta_t \sim \bigotimes_{i=0}^{n} \bigl( \exp(t s_i)(\pi_i.\eta_0) \bigr), \]

where we have used the product structure of $\exp(tS)$ and the fact that all
neglected terms, as $t \to \infty$, are exponentially small in comparison. The
meaning for $\omega_t$ is, once again, that stationary measures are complete product
measures, and the properties of the linear operators $s_i$ determine whether
there is a unique global equilibrium measure. This is connected to the general
Perron-Frobenius theory of positive operators which is rather involved in
general, see [42, Ch. V.5] and |2^. If, however, $X$ is finite (so that $\mathcal{M}^{\otimes}$ is
finite-dimensional, and $\mathcal{M}^{\otimes} = \mathcal{M}(X)$) and all $s_i$ are irreducible, there are
unique $\nu_i \in \mathcal{P}(X_i)$ so that $\exp(t s_i)\nu_i = \exp(t\lambda_i)\nu_i$ with $\lambda_i \in \mathbb{R}$ being the
largest eigenvalue of $s_i$. In this case, as a simple calculation shows, we obtain

\[ \omega_t \longrightarrow \nu_0 \otimes \cdots \otimes \nu_n \]

in the $\|\cdot\|$-topology, as $t \to \infty$, for any initial condition $\omega_0 \in \mathcal{P}(X)$.
Also, the following observation results immediately from Theorem 6.

Corollary 3 If an initial condition $\omega_0 \in \mathcal{M}^{\otimes}_+$ is a product measure at link
$\alpha \in L$, this is also true of the corresponding solution $\omega_t$ of (38), for all
$t \ge 0$. In particular, if $\omega_0$ is a complete product measure, this remains the
case under the forward flow, i.e. for all $\omega_t$ with $t \ge 0$. □

Let us return to the general discussion. The remainder is then a copy of
what we did in Section ^, with $Q$ replaced by $S$. In particular, we get

\[ \eta_t = \sum_{K \subset L} b_K(t)\,T_K\bigl(\exp(tS)\,\eta_0\bigr) \]

from which one can, once again, determine the linkage disequilibria. Note,
however, that the meaning has changed now, because the norm of $\eta_t$ varies
with time. In particular, one has to consider the quotient $\eta_t/\|\eta_t\|$, rather
than $\eta_t$ alone, to extract the correct behaviour for the linkage disequilibria
$F_G(j_1, \dots, j_k) = T_G(\omega_t)(\langle j_1, \dots, j_k\rangle)$. To be concrete, observe first that

a&G 

in perfect analogy with (^5]). Since $T_G$ is positive homogeneous of degree one
(Eq. (|2^) ), and $\|\eta_t\| = \eta_t(X)$ for positive measures, one obtains

Clearly, knowledge of the mean fitness, $S\omega_t(X)/\|\omega_t\|$, is now required to
determine the dynamics of the linkage disequilibria.



7 Afterthoughts 

In this article, we have constructed an explicit solution of the single-crossover
recombination model in continuous time, with mutation and additive selec-
tion. It is quite astonishing that such a solution should be possible at all;
after all, it is an explicit representation of a nonlinear semigroup. However,
it is no coincidence that this works in continuous time, rather than in dis-
crete time. Let us discuss this for recombination alone. The discrete-time
analogue of our single-crossover model is the so-called model with complete
interference [13]:

\[ \omega_{n+1} = \sum_{\alpha \in L} \varrho_\alpha\,R_\alpha(\omega_n) + \Bigl( 1 - \sum_{\alpha \in L} \varrho_\alpha \Bigr)\,\omega_n. \]

Similar as this may look to its continuous-time relative, the probabilistic
structure is quite different. Single crossovers in continuous time imply in-
dependence of links, as expressed in the coefficient functions (13) and the
resulting factorization property (Lemma |^). In contrast, a second crossover
is inhibited for the duration of an entire generation in discrete time, due to
interference of crossovers with each other (hence the name); see also

As a result, independence is lost, which makes the discrete model inherently
more difficult.
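For orientation, the discrete-time recursion can be written out in the simplest setting of two sites and a single link, where interference plays no role and the recombinator is the product of the two marginals divided by the total mass. The sketch below is restricted to this special case; the $2 \times 2$ initial distribution, the rate `rho` and the function names are assumptions for illustration. It uses the classical fact that the two-site linkage disequilibrium shrinks by the factor $1 - \varrho$ per generation.

```python
def marginals(w):
    """Marginal measures of a two-site measure w[i][j]."""
    rows = [sum(row) for row in w]
    cols = [sum(w[i][j] for i in range(len(w))) for j in range(len(w[0]))]
    return rows, cols


def recombinator(w):
    """R(w) = (marginal at site 0) (x) (marginal at site 1) / ||w||."""
    rows, cols = marginals(w)
    total = sum(rows)                           # ||w|| for positive w
    return [[rows[i] * cols[j] / total for j in range(len(cols))]
            for i in range(len(rows))]


def step(w, rho):
    """One generation: w' = rho * R(w) + (1 - rho) * w."""
    r = recombinator(w)
    return [[rho * r[i][j] + (1.0 - rho) * w[i][j]
             for j in range(len(w[0]))] for i in range(len(w))]


def disequilibrium(w):
    """Two-site linkage disequilibrium for a probability measure w."""
    rows, cols = marginals(w)
    return w[0][0] - rows[0] * cols[0]


w = [[0.4, 0.1], [0.1, 0.4]]     # illustrative initial distribution
rho = 0.2                        # illustrative crossover probability
d0 = disequilibrium(w)
for _ in range(5):
    w = step(w, rho)
d5 = disequilibrium(w)
rows5, cols5 = marginals(w)
```

After five generations `d5` equals `d0 * (1 - rho) ** 5` up to rounding, and the marginals are unchanged. With more than two sites, the links no longer decouple in this way in discrete time.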

Of course, this also applies to the situation with selection. Models of 
recombination and selection based on independent sites and finite site spaces 
have been thoroughly investigated in the population genetics literature, see
22| , p3| , pO| , |3T| , 0, ^ for some key references and |Tl], [1^ for recent compre-
hensive reviews. Independence of sites with respect to selection is reflected
by a tensor product structure of P, may be interpreted as lack of interaction 
between genes, and is known as absence of epistasis in genetics. More pre- 
cisely, since the dynamical systems mostly considered so far were in discrete 
time, a comparison with our setting is more adequate at the level of the 
semigroup, rather than that of the generator. 



Two notions of independence have been used, compare [^, |Tl[], which
would translate into our setting as either $\exp(P) = \prod_i \exp(P_i) = \bigotimes_i \exp(p_i)$
('multiplicative fitness') or as $\exp(P)$ replaced by $\sum_i \exp(P_i)$ ('additive fit-
ness'). Previously, much emphasis has been on the effects of dominance (i.e.
the interaction between the two alleles joined in a diploid genotype). This
may lead to multiple equilibria, which need not all be of product type, and
astonishing differences in the qualitative behaviour of the multiplicative and
additive scenario are observed, as reviewed in 30, nU. However, these ef-
fects are absent if there is no dominance (as in our model); in particular, all
equilibria are then of product type. Thus, our simple continuous-time model
might well serve as an exactly solved reference case which also captures the
qualitative features of the corresponding models in discrete time, although
no explicit solution is available there.

Now, the logical next step would be to extend the analysis to the inclusion
of interactions between sites, which occur as soon as selection is no longer
additive across sites. Alas, this is much more involved, and even the simplest
cases go far beyond what we have outlined above. The reason is that selec-
tion now forces the introduction of further terms in the right hand side of the
IPL equation so that the corresponding semigroups no longer commute with
recombination. Nevertheless, several situations can be envisioned that admit
at least a perturbative approach. In line with the single-crossover assump-
tion, an expansion for small recombination rates would be appropriate, in
contrast to the well-known quasi-linkage-equilibrium approach for large re-
combination rates (for review, see [11]). We hope to report on some progress
in this direction soon.



Appendix: Moments versus correlations 

As mentioned above, it is often desirable to separate effects that stem from
mutual interactions of differently many "particles" or, as in the above discus-
sion, from specification at a different number of sites. For two sites, correla-
tions $C$ and moments $F$ are related by $C(\{i,j\}) = F(\{i,j\}) - F(\{i\})\,F(\{j\})$,
where the arguments are meant as symbolic labels. Since this is a rather
general structure, we briefly describe its systematic treatment by means of
Möbius inversion, also known as inclusion-exclusion principle.

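Let $S = \{1, 2, \dots, k\}$ be a finite set which will serve as the index set
of the particles or the specified sites, the latter through $\langle j_1, \dots, j_k\rangle$. Let
$\mathcal{A} = \{A_1, \dots, A_p\}$ be a partition of $S$, i.e. $S$ is the disjoint union of the
members of $\mathcal{A}$. Unlike before, the partition need not be ordered. Let the
partition $\mathcal{B} = \{B_1, \dots, B_q\}$ be a refinement of $\mathcal{A}$, so that

\[ A_1 = B_{j_{1,1}} \cup \dots \cup B_{j_{1,n_1}}, \quad \dots, \quad A_p = B_{j_{p,1}} \cup \dots \cup B_{j_{p,n_p}} \]

where $\{\{j_{1,1}, \dots, j_{1,n_1}\}, \dots, \{j_{p,1}, \dots, j_{p,n_p}\}\}$ is a partition of $\{1, \dots, q\}$, hence
$n_1 + \dots + n_p = q$. We write $\mathcal{B} \preccurlyeq \mathcal{A}$ in this case, where $\preccurlyeq$ defines a partial order
which makes the set of partitions of $S$ into a poset. The corresponding Möbius function, compare
T0| , p. 86], is given by

\[ \mu(\mathcal{B}, \mathcal{A}) = \prod_{i=1}^{p} (-1)^{n_i - 1}\,(n_i - 1)! = (-1)^{p + n_1 + \dots + n_p}\,(n_1 - 1)! \cdot \ldots \cdot (n_p - 1)! \tag{41} \]

If $\mathcal{C}$ is any refinement of $\mathcal{A}$, $\mu$ satisfies the formula

\[ \sum_{\mathcal{C} \preccurlyeq \mathcal{B} \preccurlyeq \mathcal{A}} \mu(\mathcal{B}, \mathcal{A}) = \begin{cases} 1 & \text{if } \mathcal{A} = \mathcal{C} \\ 0 & \text{otherwise.} \end{cases} \]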



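Let us now, for a partition $\mathcal{A} = \{A_1, \dots, A_p\}$, introduce the function
$F(\mathcal{A}) = F(A_1) \cdot \ldots \cdot F(A_p)$, and similarly for the correlations, $C$. These
quantities are related by

\[ F(\mathcal{A}) := \sum_{\mathcal{B} \preccurlyeq \mathcal{A}} C(\mathcal{B}) = \sum_{\mathcal{B} \preccurlyeq \mathcal{A}} \prod_{B \in \mathcal{B}} C(B) \tag{42} \]

because this precisely reflects the idea to separate off contributions from
subsets of different cardinality. The Möbius inversion formula then gives the
following formula for the special case that $\mathcal{A} = \{A\}$:

\[ C(A) = \sum_{\mathcal{B} \preccurlyeq \mathcal{A}} \mu(\mathcal{B}, \mathcal{A})\,F(\mathcal{B}) = \sum_{\mathcal{B} \preccurlyeq \mathcal{A}} (-1)^{|\mathcal{B}| - 1}\,(|\mathcal{B}| - 1)! \prod_{i=1}^{|\mathcal{B}|} F(B_i) \tag{43} \]

where $|\mathcal{B}|$ denotes the number of sets in the partition $\mathcal{B} = \{B_1, \dots, B_{|\mathcal{B}|}\}$.
The following example might illustrate this formula:

\[ \begin{aligned} C(\{1,2,3\}) = {} & F(\{1,2,3\}) + 2\,F(\{1\})F(\{2\})F(\{3\}) \\ & - F(\{1\})F(\{2,3\}) - F(\{2\})F(\{1,3\}) - F(\{3\})F(\{1,2\}) \end{aligned} \]

which is to be compared with

\[ \begin{aligned} F(\{1,2,3\}) = {} & C(\{1,2,3\}) + C(\{1\})C(\{2\})C(\{3\}) \\ & + C(\{1\})C(\{2,3\}) + C(\{2\})C(\{1,3\}) + C(\{3\})C(\{1,2\}) \end{aligned} \]

according to (42). Let us finally remark that formula (43) can be applied
factorwise if $\mathcal{A} = \{A_1, \dots, A_p\}$ because then $C(\mathcal{A}) = C(A_1) \cdot \ldots \cdot C(A_p)$ by
definition.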
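The passage between moments and correlations can be carried out mechanically by summing over set partitions. The sketch below is one possible way to do this, assuming that the moments are those of three binary variables with an arbitrarily chosen joint distribution `dist`; the distribution and all function names are hypothetical and serve only to exercise the two formulas.

```python
from math import factorial


def set_partitions(items):
    """Yield all partitions of a list as lists of blocks (tuples)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        for i in range(len(part)):              # join an existing block
            yield part[:i] + [(first,) + part[i]] + part[i + 1:]
        yield [(first,)] + part                 # or form a new block


def correlation(F, A):
    """C(A) as the sum over partitions B of A of
    (-1)^(|B|-1) (|B|-1)! times the product of F over the blocks of B."""
    total = 0.0
    for part in set_partitions(sorted(A)):
        term = (-1) ** (len(part) - 1) * factorial(len(part) - 1)
        for block in part:
            term *= F(frozenset(block))
        total += term
    return total


def moment_from_correlations(C, A):
    """F(A) as the sum over partitions B of A of the product of C over blocks."""
    total = 0.0
    for part in set_partitions(sorted(A)):
        term = 1.0
        for block in part:
            term *= C(frozenset(block))
        total += term
    return total


# Illustrative joint distribution of three binary variables (assumption).
dist = {(0, 0, 0): 0.10, (0, 0, 1): 0.10, (0, 1, 0): 0.05, (0, 1, 1): 0.15,
        (1, 0, 0): 0.20, (1, 0, 1): 0.05, (1, 1, 0): 0.05, (1, 1, 1): 0.30}


def F(sites):
    """Moment: probability that all listed sites (labelled 1..3) carry a 1."""
    return sum(p for x, p in dist.items() if all(x[i - 1] == 1 for i in sites))


c123 = correlation(F, {1, 2, 3})
f123 = moment_from_correlations(lambda B: correlation(F, B), {1, 2, 3})
```

Here `c123` reproduces the three-site example written out above, and `f123` recovers the moment `F({1, 2, 3})` from the correlations.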